This notebook is a template outlining each step you need to complete for the project.
Please fill in your code where there are explicit ? markers in the notebook. You are welcome to add more cells and code as you see fit.
Once you have completed all the code implementations, please export your notebook as an HTML file so that reviewers can view your code. Make sure all cell outputs are rendered correctly.
File-> Export Notebook As... -> Export Notebook as HTML
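If the JupyterLab menu is unavailable, the same export can be scripted with `nbconvert` (a sketch demonstrated on a throwaway in-memory notebook so it is self-contained; swap in `nbformat.read(...)` on your actual `.ipynb` file):

```python
# Sketch: programmatic equivalent of File -> Export Notebook As... -> HTML,
# shown on a small in-memory notebook rather than a real file on disk.
import nbformat
from nbconvert import HTMLExporter

nb = nbformat.v4.new_notebook()
nb.cells.append(nbformat.v4.new_code_cell("print('hello')"))

# Convert the notebook object to a standalone HTML document
html_body, _ = HTMLExporter().from_notebook_node(nb)
with open("demo_export.html", "w") as f:
    f.write(html_body)
```

For the real notebook, `nb = nbformat.read("your_notebook.ipynb", as_version=4)` (filename is a placeholder) replaces the in-memory construction.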
There is also a writeup to complete after all code implementation is done. Please answer all questions and attach the necessary tables and charts. You can complete the writeup in either Markdown or PDF.
Completing the code template and writeup template will cover all of the rubric points for this project.
The rubric contains "Stand Out Suggestions" for enhancing the project beyond the minimum requirements. The stand out suggestions are optional. If you decide to pursue the "stand out suggestions", you can include the code in this notebook and also discuss the results in the writeup file.
Below is an example of the steps to get the API username and key. Each student will have their own username and key.
Download kaggle.json and use the username and key it contains.
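One common way to make those credentials available to the `kaggle` command-line tool is to write them to `~/.kaggle/kaggle.json`. A minimal sketch, with placeholder values standing in for your real username and key:

```python
# Sketch: write Kaggle API credentials to the CLI's default location.
# The values below are placeholders -- substitute the ones from your kaggle.json.
import json
import os
from pathlib import Path

kaggle_creds = {"username": "your_username", "key": "your_api_key"}

# The Kaggle CLI looks for credentials in ~/.kaggle/kaggle.json
kaggle_dir = Path.home() / ".kaggle"
kaggle_dir.mkdir(exist_ok=True)
cred_path = kaggle_dir / "kaggle.json"
cred_path.write_text(json.dumps(kaggle_creds))

# Restrict permissions; the Kaggle CLI warns if the file is world-readable
os.chmod(cred_path, 0o600)
```

After this, commands such as `!kaggle competitions download -c <competition>` should authenticate without prompting.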
ml.t3.medium instance (2 vCPU + 4 GiB)
Kernel: Python 3 (MXNet 1.8 Python 3.7 CPU Optimized)

!pip install -U pip
!pip install -U setuptools wheel
!pip install -U "mxnet<2.0.0" bokeh==2.0.1
!pip install autogluon --no-cache-dir
!pip install kaggle
# Without --no-cache-dir, smaller AWS instances may have trouble installing
Requirement already satisfied: pip in /usr/local/lib/python3.7/site-packages (21.3.1)
Collecting pip
Using cached pip-22.3.1-py3-none-any.whl (2.1 MB)
Installing collected packages: pip
Attempting uninstall: pip
Found existing installation: pip 21.3.1
Uninstalling pip-21.3.1:
Successfully uninstalled pip-21.3.1
Successfully installed pip-22.3.1
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/site-packages (59.4.0)
Collecting setuptools
Using cached setuptools-65.6.3-py3-none-any.whl (1.2 MB)
Collecting wheel
Using cached wheel-0.38.4-py3-none-any.whl (36 kB)
Installing collected packages: wheel, setuptools
Attempting uninstall: setuptools
Found existing installation: setuptools 59.4.0
Uninstalling setuptools-59.4.0:
Successfully uninstalled setuptools-59.4.0
Successfully installed setuptools-65.6.3 wheel-0.38.4
Collecting mxnet<2.0.0
Using cached mxnet-1.9.1-py3-none-manylinux2014_x86_64.whl (49.1 MB)
Collecting bokeh==2.0.1
Using cached bokeh-2.0.1-py3-none-any.whl
Requirement already satisfied: packaging>=16.8 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (21.3)
Requirement already satisfied: typing-extensions>=3.7.4 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (4.0.1)
Requirement already satisfied: numpy>=1.11.3 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (1.19.1)
Requirement already satisfied: Jinja2>=2.7 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (3.0.3)
Requirement already satisfied: PyYAML>=3.10 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (5.4.1)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (2.8.2)
Requirement already satisfied: pillow>=4.0 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (8.4.0)
Requirement already satisfied: tornado>=5 in /usr/local/lib/python3.7/site-packages (from bokeh==2.0.1) (6.1)
Requirement already satisfied: requests<3,>=2.20.0 in /usr/local/lib/python3.7/site-packages (from mxnet<2.0.0) (2.22.0)
Requirement already satisfied: graphviz<0.9.0,>=0.8.1 in /usr/local/lib/python3.7/site-packages (from mxnet<2.0.0) (0.8.4)
Requirement already satisfied: MarkupSafe>=2.0 in /usr/local/lib/python3.7/site-packages (from Jinja2>=2.7->bokeh==2.0.1) (2.0.1)
Requirement already satisfied: pyparsing!=3.0.5,>=2.0.2 in /usr/local/lib/python3.7/site-packages (from packaging>=16.8->bokeh==2.0.1) (3.0.6)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/site-packages (from python-dateutil>=2.1->bokeh==2.0.1) (1.16.0)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (2021.10.8)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (3.0.4)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/site-packages (from requests<3,>=2.20.0->mxnet<2.0.0) (1.25.11)
Installing collected packages: mxnet, bokeh
Attempting uninstall: bokeh
Found existing installation: bokeh 2.4.2
Uninstalling bokeh-2.4.2:
Successfully uninstalled bokeh-2.4.2
Successfully installed bokeh-2.0.1 mxnet-1.9.1
Collecting autogluon
Downloading autogluon-0.6.1-py3-none-any.whl (9.8 kB)
Collecting autogluon.tabular[all]==0.6.1
Downloading autogluon.tabular-0.6.1-py3-none-any.whl (286 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 286.0/286.0 kB 124.9 MB/s eta 0:00:00
Collecting autogluon.timeseries[all]==0.6.1
Downloading autogluon.timeseries-0.6.1-py3-none-any.whl (103 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 103.0/103.0 kB 198.1 MB/s eta 0:00:00
Collecting autogluon.features==0.6.1
Downloading autogluon.features-0.6.1-py3-none-any.whl (59 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.0/60.0 kB 172.7 MB/s eta 0:00:00
Collecting autogluon.core[all]==0.6.1
Downloading autogluon.core-0.6.1-py3-none-any.whl (226 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 226.6/226.6 kB 222.9 MB/s eta 0:00:00
Collecting autogluon.multimodal==0.6.1
Downloading autogluon.multimodal-0.6.1-py3-none-any.whl (289 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 289.7/289.7 kB 226.8 MB/s eta 0:00:00
Collecting autogluon.vision==0.6.1
Downloading autogluon.vision-0.6.1-py3-none-any.whl (49 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.8/49.8 kB 163.5 MB/s eta 0:00:00
Collecting autogluon.text==0.6.1
Downloading autogluon.text-0.6.1-py3-none-any.whl (62 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 62.1/62.1 kB 124.8 MB/s eta 0:00:00
Requirement already satisfied: requests in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.6.1->autogluon) (2.22.0)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.6.1->autogluon) (3.5.0)
Requirement already satisfied: scikit-learn<1.2,>=1.0.0 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.6.1->autogluon) (1.0.1)
Collecting dask<=2021.11.2,>=2021.09.1
Downloading dask-2021.11.2-py3-none-any.whl (1.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 239.0 MB/s eta 0:00:00
Requirement already satisfied: pandas!=1.4.0,<1.6,>=1.2.5 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.6.1->autogluon) (1.3.4)
Collecting scipy<1.10.0,>=1.5.4
Downloading scipy-1.7.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (38.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 38.1/38.1 MB 169.9 MB/s eta 0:00:00
Requirement already satisfied: tqdm>=4.38.0 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.6.1->autogluon) (4.39.0)
Collecting distributed<=2021.11.2,>=2021.09.1
Downloading distributed-2021.11.2-py3-none-any.whl (802 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 802.2/802.2 kB 253.0 MB/s eta 0:00:00
Requirement already satisfied: boto3 in /usr/local/lib/python3.7/site-packages (from autogluon.core[all]==0.6.1->autogluon) (1.20.17)
Collecting autogluon.common==0.6.1
Downloading autogluon.common-0.6.1-py3-none-any.whl (41 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 41.5/41.5 kB 138.0 MB/s eta 0:00:00
Collecting numpy<1.24,>=1.21
Downloading numpy-1.21.6-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (15.7 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 15.7/15.7 MB 170.2 MB/s eta 0:00:00
Collecting ray[tune]<2.1,>=2.0
Downloading ray-2.0.1-cp37-cp37m-manylinux2014_x86_64.whl (60.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 60.5/60.5 MB 165.5 MB/s eta 0:00:00
Collecting hyperopt<0.2.8,>=0.2.7
Downloading hyperopt-0.2.7-py2.py3-none-any.whl (1.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.6/1.6 MB 235.8 MB/s eta 0:00:00
Requirement already satisfied: psutil<6,>=5.7.3 in /usr/local/lib/python3.7/site-packages (from autogluon.features==0.6.1->autogluon) (5.8.0)
Collecting evaluate<=0.3.0
Downloading evaluate-0.3.0-py3-none-any.whl (72 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 72.9/72.9 kB 173.0 MB/s eta 0:00:00
Collecting transformers<4.24.0,>=4.23.0
Downloading transformers-4.23.1-py3-none-any.whl (5.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 5.3/5.3 MB 183.8 MB/s eta 0:00:00
Collecting nptyping<1.5.0,>=1.4.4
Downloading nptyping-1.4.4-py3-none-any.whl (31 kB)
Collecting jsonschema<=4.8.0
Downloading jsonschema-4.8.0-py3-none-any.whl (81 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 81.4/81.4 kB 203.0 MB/s eta 0:00:00
Collecting torchmetrics<0.9.0,>=0.8.0
Downloading torchmetrics-0.8.2-py3-none-any.whl (409 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 409.8/409.8 kB 169.6 MB/s eta 0:00:00
Collecting pytorch-metric-learning<1.4.0,>=1.3.0
Downloading pytorch_metric_learning-1.3.2-py3-none-any.whl (109 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 109.4/109.4 kB 192.2 MB/s eta 0:00:00
Collecting torch<1.13,>=1.9
Downloading torch-1.12.1-cp37-cp37m-manylinux1_x86_64.whl (776.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 776.3/776.3 MB 171.6 MB/s eta 0:00:00
Collecting timm<0.7.0
Downloading timm-0.6.12-py3-none-any.whl (549 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 549.1/549.1 kB 238.6 MB/s eta 0:00:00
Collecting nlpaug<=1.1.10,>=1.1.10
Downloading nlpaug-1.1.10-py3-none-any.whl (410 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 410.8/410.8 kB 235.6 MB/s eta 0:00:00
Collecting pytorch-lightning<1.8.0,>=1.7.4
Downloading pytorch_lightning-1.7.7-py3-none-any.whl (708 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 708.1/708.1 kB 242.0 MB/s eta 0:00:00
Collecting sentencepiece<0.2.0,>=0.1.95
Downloading sentencepiece-0.1.97-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 217.6 MB/s eta 0:00:00
Collecting fairscale<=0.4.6,>=0.4.5
Downloading fairscale-0.4.6.tar.gz (248 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 248.2/248.2 kB 221.5 MB/s eta 0:00:00
Installing build dependencies ... done
Getting requirements to build wheel ... done
Installing backend dependencies ... done
Preparing metadata (pyproject.toml) ... done
Collecting nltk<4.0.0,>=3.4.5
Downloading nltk-3.8-py3-none-any.whl (1.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.5/1.5 MB 227.1 MB/s eta 0:00:00
Collecting defusedxml<=0.7.1,>=0.7.1
Downloading defusedxml-0.7.1-py2.py3-none-any.whl (25 kB)
Collecting omegaconf<2.2.0,>=2.1.1
Downloading omegaconf-2.1.2-py3-none-any.whl (74 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 74.7/74.7 kB 191.1 MB/s eta 0:00:00
Collecting torchtext<0.14.0
Downloading torchtext-0.13.1-cp37-cp37m-manylinux1_x86_64.whl (1.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.9/1.9 MB 248.0 MB/s eta 0:00:00
Collecting torchvision<0.14.0
Downloading torchvision-0.13.1-cp37-cp37m-manylinux1_x86_64.whl (19.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 19.1/19.1 MB 190.5 MB/s eta 0:00:00
Collecting Pillow<=9.4.0,>=9.3.0
Downloading Pillow-9.3.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (3.2 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 3.2/3.2 MB 250.2 MB/s eta 0:00:00
Collecting seqeval<=1.2.2
Downloading seqeval-1.2.2.tar.gz (43 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.6/43.6 kB 147.1 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting scikit-image<0.20.0,>=0.19.1
Downloading scikit_image-0.19.3-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (13.5 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 13.5/13.5 MB 162.8 MB/s eta 0:00:00
Collecting albumentations<=1.2.0,>=1.1.0
Downloading albumentations-1.2.0-py3-none-any.whl (113 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 113.5/113.5 kB 206.3 MB/s eta 0:00:00
Collecting text-unidecode<=1.3
Downloading text_unidecode-1.3-py2.py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.2/78.2 kB 192.0 MB/s eta 0:00:00
Collecting openmim<=0.2.1,>0.1.5
Downloading openmim-0.2.1-py2.py3-none-any.whl (49 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 49.7/49.7 kB 172.2 MB/s eta 0:00:00
Collecting accelerate<0.14,>=0.9
Downloading accelerate-0.13.2-py3-none-any.whl (148 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 148.8/148.8 kB 221.9 MB/s eta 0:00:00
Collecting smart-open<5.3.0,>=5.2.1
Downloading smart_open-5.2.1-py3-none-any.whl (58 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 58.6/58.6 kB 173.2 MB/s eta 0:00:00
Requirement already satisfied: networkx<3.0,>=2.3 in /usr/local/lib/python3.7/site-packages (from autogluon.tabular[all]==0.6.1->autogluon) (2.6.3)
Collecting fastai<2.8,>=2.3.1
Downloading fastai-2.7.10-py3-none-any.whl (240 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 240.9/240.9 kB 242.0 MB/s eta 0:00:00
Collecting lightgbm<3.4,>=3.3
Downloading lightgbm-3.3.3-py3-none-manylinux1_x86_64.whl (2.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 2.0/2.0 MB 257.8 MB/s eta 0:00:00
Collecting xgboost<1.8,>=1.6
Downloading xgboost-1.6.2-py3-none-manylinux2014_x86_64.whl (255.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 255.9/255.9 MB 179.4 MB/s eta 0:00:00
Collecting catboost<1.2,>=1.0
Downloading catboost-1.1.1-cp37-none-manylinux1_x86_64.whl (76.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 76.6/76.6 MB 185.4 MB/s eta 0:00:00
Collecting statsmodels~=0.13.0
Downloading statsmodels-0.13.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.9 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 9.9/9.9 MB 195.5 MB/s eta 0:00:00
Collecting gluonts~=0.11.0
Downloading gluonts-0.11.6-py3-none-any.whl (1.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.0/1.0 MB 251.1 MB/s eta 0:00:00
Requirement already satisfied: joblib~=1.1 in /usr/local/lib/python3.7/site-packages (from autogluon.timeseries[all]==0.6.1->autogluon) (1.1.0)
Collecting sktime<0.14,>=0.13.1
Downloading sktime-0.13.4-py3-none-any.whl (7.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.0/7.0 MB 180.1 MB/s eta 0:00:00
Collecting pmdarima~=1.8.2
Downloading pmdarima-1.8.5-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.manylinux_2_24_x86_64.whl (1.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.4/1.4 MB 250.6 MB/s eta 0:00:00
Collecting tbats~=1.1
Downloading tbats-1.1.2-py3-none-any.whl (43 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 43.8/43.8 kB 143.5 MB/s eta 0:00:00
Collecting gluoncv<0.10.6,>=0.10.5
Downloading gluoncv-0.10.5.post0-py2.py3-none-any.whl (1.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 1.3/1.3 MB 258.1 MB/s eta 0:00:00
Requirement already satisfied: setuptools in /usr/local/lib/python3.7/site-packages (from autogluon.common==0.6.1->autogluon.core[all]==0.6.1->autogluon) (65.6.3)
Requirement already satisfied: packaging>=20.0 in /usr/local/lib/python3.7/site-packages (from accelerate<0.14,>=0.9->autogluon.multimodal==0.6.1->autogluon) (21.3)
Requirement already satisfied: pyyaml in /usr/local/lib/python3.7/site-packages (from accelerate<0.14,>=0.9->autogluon.multimodal==0.6.1->autogluon) (5.4.1)
Collecting albumentations<=1.2.0,>=1.1.0
Downloading albumentations-1.1.0-py3-none-any.whl (102 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 102.4/102.4 kB 202.0 MB/s eta 0:00:00
Collecting qudida>=0.0.4
Downloading qudida-0.0.4-py3-none-any.whl (3.5 kB)
Collecting opencv-python-headless>=4.1.1
Downloading opencv_python_headless-4.6.0.66-cp36-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (48.3 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 48.3/48.3 MB 177.0 MB/s eta 0:00:00
Requirement already satisfied: graphviz in /usr/local/lib/python3.7/site-packages (from catboost<1.2,>=1.0->autogluon.tabular[all]==0.6.1->autogluon) (0.8.4)
Requirement already satisfied: plotly in /usr/local/lib/python3.7/site-packages (from catboost<1.2,>=1.0->autogluon.tabular[all]==0.6.1->autogluon) (5.4.0)
Requirement already satisfied: six in /usr/local/lib/python3.7/site-packages (from catboost<1.2,>=1.0->autogluon.tabular[all]==0.6.1->autogluon) (1.16.0)
Collecting toolz>=0.8.2
Downloading toolz-0.12.0-py3-none-any.whl (55 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 55.8/55.8 kB 154.0 MB/s eta 0:00:00
Requirement already satisfied: fsspec>=0.6.0 in /usr/local/lib/python3.7/site-packages (from dask<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.6.1->autogluon) (2021.11.1)
Requirement already satisfied: cloudpickle>=1.1.1 in /usr/local/lib/python3.7/site-packages (from dask<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.6.1->autogluon) (2.0.0)
Collecting partd>=0.3.10
Downloading partd-1.3.0-py3-none-any.whl (18 kB)
Collecting click>=6.6
Downloading click-8.1.3-py3-none-any.whl (96 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 96.6/96.6 kB 184.4 MB/s eta 0:00:00
Collecting msgpack>=0.6.0
Downloading msgpack-1.0.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (299 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 299.8/299.8 kB 227.4 MB/s eta 0:00:00
Collecting zict>=0.1.3
Downloading zict-2.2.0-py2.py3-none-any.whl (23 kB)
Requirement already satisfied: jinja2 in /usr/local/lib/python3.7/site-packages (from distributed<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.6.1->autogluon) (3.0.3)
Collecting sortedcontainers!=2.0.0,!=2.0.1
Downloading sortedcontainers-2.4.0-py2.py3-none-any.whl (29 kB)
Collecting tblib>=1.6.0
Downloading tblib-1.7.0-py2.py3-none-any.whl (12 kB)
Requirement already satisfied: tornado>=5 in /usr/local/lib/python3.7/site-packages (from distributed<=2021.11.2,>=2021.09.1->autogluon.core[all]==0.6.1->autogluon) (6.1)
Collecting responses<0.19
Downloading responses-0.18.0-py3-none-any.whl (38 kB)
Collecting xxhash
Downloading xxhash-3.1.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (212 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 213.0/213.0 kB 226.0 MB/s eta 0:00:00
Requirement already satisfied: dill in /usr/local/lib/python3.7/site-packages (from evaluate<=0.3.0->autogluon.multimodal==0.6.1->autogluon) (0.3.4)
Collecting huggingface-hub>=0.7.0
Downloading huggingface_hub-0.11.1-py3-none-any.whl (182 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 182.4/182.4 kB 211.1 MB/s eta 0:00:00
Collecting tqdm>=4.38.0
Downloading tqdm-4.64.1-py2.py3-none-any.whl (78 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 78.5/78.5 kB 188.4 MB/s eta 0:00:00
Requirement already satisfied: importlib-metadata in /usr/local/lib/python3.7/site-packages (from evaluate<=0.3.0->autogluon.multimodal==0.6.1->autogluon) (4.8.2)
Requirement already satisfied: multiprocess in /usr/local/lib/python3.7/site-packages (from evaluate<=0.3.0->autogluon.multimodal==0.6.1->autogluon) (0.70.12.2)
Collecting datasets>=2.0.0
Downloading datasets-2.8.0-py3-none-any.whl (452 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 452.9/452.9 kB 233.9 MB/s eta 0:00:00
Requirement already satisfied: pip in /usr/local/lib/python3.7/site-packages (from fastai<2.8,>=2.3.1->autogluon.tabular[all]==0.6.1->autogluon) (22.3.1)
Collecting spacy<4
Downloading spacy-3.4.4-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (6.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.4/6.4 MB 225.8 MB/s eta 0:00:00
Collecting fastprogress>=0.2.4
Downloading fastprogress-1.0.3-py3-none-any.whl (12 kB)
Collecting fastcore<1.6,>=1.4.5
Downloading fastcore-1.5.27-py3-none-any.whl (67 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 67.1/67.1 kB 106.1 MB/s eta 0:00:00
Collecting fastdownload<2,>=0.0.5
Downloading fastdownload-0.0.7-py3-none-any.whl (12 kB)
Collecting yacs
Downloading yacs-0.1.8-py3-none-any.whl (14 kB)
Requirement already satisfied: portalocker in /usr/local/lib/python3.7/site-packages (from gluoncv<0.10.6,>=0.10.5->autogluon.vision==0.6.1->autogluon) (2.3.2)
Requirement already satisfied: opencv-python in /usr/local/lib/python3.7/site-packages (from gluoncv<0.10.6,>=0.10.5->autogluon.vision==0.6.1->autogluon) (4.5.4.60)
Collecting autocfg
Downloading autocfg-0.0.8-py3-none-any.whl (13 kB)
Requirement already satisfied: typing-extensions~=4.0 in /usr/local/lib/python3.7/site-packages (from gluonts~=0.11.0->autogluon.timeseries[all]==0.6.1->autogluon) (4.0.1)
Collecting pydantic~=1.7
Downloading pydantic-1.10.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 11.8/11.8 MB 191.8 MB/s eta 0:00:00
Collecting py4j
Downloading py4j-0.10.9.7-py2.py3-none-any.whl (200 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 200.5/200.5 kB 217.9 MB/s eta 0:00:00
Collecting future
Downloading future-0.18.2.tar.gz (829 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 829.2/829.2 kB 255.4 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting importlib-resources>=1.4.0
Downloading importlib_resources-5.10.1-py3-none-any.whl (34 kB)
Requirement already satisfied: attrs>=17.4.0 in /usr/local/lib/python3.7/site-packages (from jsonschema<=4.8.0->autogluon.multimodal==0.6.1->autogluon) (21.2.0)
Collecting pyrsistent!=0.17.0,!=0.17.1,!=0.17.2,>=0.14.0
Downloading pyrsistent-0.19.2-py3-none-any.whl (57 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 57.5/57.5 kB 167.2 MB/s eta 0:00:00
Requirement already satisfied: wheel in /usr/local/lib/python3.7/site-packages (from lightgbm<3.4,>=3.3->autogluon.tabular[all]==0.6.1->autogluon) (0.38.4)
Collecting regex>=2021.8.3
Downloading regex-2022.10.31-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (757 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 757.1/757.1 kB 250.5 MB/s eta 0:00:00
Collecting typish>=1.7.0
Downloading typish-1.9.3-py3-none-any.whl (45 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 45.1/45.1 kB 142.6 MB/s eta 0:00:00
Collecting antlr4-python3-runtime==4.8
Downloading antlr4-python3-runtime-4.8.tar.gz (112 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 112.4/112.4 kB 191.7 MB/s eta 0:00:00
Preparing metadata (setup.py) ... done
Collecting rich
Downloading rich-12.6.0-py3-none-any.whl (237 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 237.5/237.5 kB 222.5 MB/s eta 0:00:00
Collecting model-index
Downloading model_index-0.1.11-py3-none-any.whl (34 kB)
Requirement already satisfied: colorama in /usr/local/lib/python3.7/site-packages (from openmim<=0.2.1,>0.1.5->autogluon.multimodal==0.6.1->autogluon) (0.4.3)
Requirement already satisfied: tabulate in /usr/local/lib/python3.7/site-packages (from openmim<=0.2.1,>0.1.5->autogluon.multimodal==0.6.1->autogluon) (0.8.9)
Requirement already satisfied: pytz>=2017.3 in /usr/local/lib/python3.7/site-packages (from pandas!=1.4.0,<1.6,>=1.2.5->autogluon.core[all]==0.6.1->autogluon) (2021.3)
Requirement already satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.7/site-packages (from pandas!=1.4.0,<1.6,>=1.2.5->autogluon.core[all]==0.6.1->autogluon) (2.8.2)
Requirement already satisfied: Cython!=0.29.18,>=0.29 in /usr/local/lib/python3.7/site-packages (from pmdarima~=1.8.2->autogluon.timeseries[all]==0.6.1->autogluon) (0.29.24)
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/site-packages (from pmdarima~=1.8.2->autogluon.timeseries[all]==0.6.1->autogluon) (1.25.11)
Collecting pyDeprecate>=0.3.1
Downloading pyDeprecate-0.3.2-py3-none-any.whl (10 kB)
Collecting tensorboard>=2.9.1
Downloading tensorboard-2.11.0-py3-none-any.whl (6.0 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.0/6.0 MB 216.5 MB/s eta 0:00:00
Collecting virtualenv
Downloading virtualenv-20.17.1-py3-none-any.whl (8.8 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 8.8/8.8 MB 182.2 MB/s eta 0:00:00
Collecting grpcio<=1.43.0,>=1.32.0
Downloading grpcio-1.43.0-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (4.1 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 4.1/4.1 MB 167.7 MB/s eta 0:00:00
Collecting frozenlist
Downloading frozenlist-1.3.3-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (148 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 148.0/148.0 kB 214.2 MB/s eta 0:00:00
Collecting aiosignal
Downloading aiosignal-1.3.1-py3-none-any.whl (7.6 kB)
Collecting click>=6.6
Downloading click-8.0.4-py3-none-any.whl (97 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 97.5/97.5 kB 87.3 MB/s eta 0:00:00
Collecting filelock
Downloading filelock-3.8.2-py3-none-any.whl (10 kB)
Requirement already satisfied: protobuf<4.0.0,>=3.15.3 in /usr/local/lib/python3.7/site-packages (from ray[tune]<2.1,>=2.0->autogluon.core[all]==0.6.1->autogluon) (3.19.1)
Collecting tensorboardX>=1.9
Downloading tensorboardX-2.5.1-py2.py3-none-any.whl (125 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 125.4/125.4 kB 218.6 MB/s eta 0:00:00
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/site-packages (from requests->autogluon.core[all]==0.6.1->autogluon) (2021.10.8)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests->autogluon.core[all]==0.6.1->autogluon) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests->autogluon.core[all]==0.6.1->autogluon) (3.0.4)
Collecting tifffile>=2019.7.26
Downloading tifffile-2021.11.2-py3-none-any.whl (178 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 178.9/178.9 kB 225.2 MB/s eta 0:00:00
Requirement already satisfied: imageio>=2.4.1 in /usr/local/lib/python3.7/site-packages (from scikit-image<0.20.0,>=0.19.1->autogluon.multimodal==0.6.1->autogluon) (2.13.1)
Collecting PyWavelets>=1.1.1
Downloading PyWavelets-1.3.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (6.4 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 6.4/6.4 MB 207.4 MB/s eta 0:00:00
Requirement already satisfied: threadpoolctl>=2.0.0 in /usr/local/lib/python3.7/site-packages (from scikit-learn<1.2,>=1.0.0->autogluon.core[all]==0.6.1->autogluon) (3.0.0)
Collecting deprecated>=1.2.13
Downloading Deprecated-1.2.13-py2.py3-none-any.whl (9.6 kB)
Requirement already satisfied: numba>=0.53 in /usr/local/lib/python3.7/site-packages (from sktime<0.14,>=0.13.1->autogluon.timeseries[all]==0.6.1->autogluon) (0.53.1)
Collecting patsy>=0.5.2
Downloading patsy-0.5.3-py2.py3-none-any.whl (233 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 233.8/233.8 kB 240.6 MB/s eta 0:00:00
Collecting tokenizers!=0.11.3,<0.14,>=0.11.1
Downloading tokenizers-0.13.2-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (7.6 MB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 7.6/7.6 MB 177.3 MB/s eta 0:00:00
Requirement already satisfied: s3transfer<0.6.0,>=0.5.0 in /usr/local/lib/python3.7/site-packages (from boto3->autogluon.core[all]==0.6.1->autogluon) (0.5.0)
Requirement already satisfied: botocore<1.24.0,>=1.23.17 in /usr/local/lib/python3.7/site-packages (from boto3->autogluon.core[all]==0.6.1->autogluon) (1.23.17)
Requirement already satisfied: jmespath<1.0.0,>=0.7.1 in /usr/local/lib/python3.7/site-packages (from boto3->autogluon.core[all]==0.6.1->autogluon) (0.10.0)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.6.1->autogluon) (1.3.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.6.1->autogluon) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.6.1->autogluon) (4.28.2)
Requirement already satisfied: setuptools-scm>=4 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.6.1->autogluon) (6.3.2)
Requirement already satisfied: pyparsing>=2.2.1 in /usr/local/lib/python3.7/site-packages (from matplotlib->autogluon.core[all]==0.6.1->autogluon) (3.0.6)
Requirement already satisfied: pyarrow>=6.0.0 in /usr/local/lib/python3.7/site-packages (from datasets>=2.0.0->evaluate<=0.3.0->autogluon.multimodal==0.6.1->autogluon) (6.0.1)
Collecting aiohttp
Downloading aiohttp-3.8.3-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (948 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 948.0/948.0 kB 258.6 MB/s eta 0:00:00
Collecting wrapt<2,>=1.10
Downloading wrapt-1.14.1-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_17_x86_64.manylinux2014_x86_64.whl (75 kB)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 75.2/75.2 kB 190.7 MB/s eta 0:00:00
Requirement already satisfied: zipp>=3.1.0 in /usr/local/lib/python3.7/site-packages (from importlib-resources>=1.4.0->jsonschema<=4.8.0->autogluon.multimodal==0.6.1->autogluon) (3.6.0)
Requirement already satisfied: llvmlite<0.37,>=0.36.0rc1 in /usr/local/lib/python3.7/site-packages (from numba>=0.53->sktime<0.14,>=0.13.1->autogluon.timeseries[all]==0.6.1->autogluon) (0.36.0)
Collecting locket
Downloading locket-1.0.0-py2.py3-none-any.whl (4.4 kB)
Collecting typing-extensions~=4.0
Downloading typing_extensions-4.4.0-py3-none-any.whl (26 kB)
Requirement already satisfied: tomli>=1.0.0 in /usr/local/lib/python3.7/site-packages (from setuptools-scm>=4->matplotlib->autogluon.core[all]==0.6.1->autogluon) (1.2.2)
Collecting thinc<8.2.0,>=8.1.0
Downloading thinc-8.1.6-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (814 kB)
Successfully built fairscale antlr4-python3-runtime seqeval future
Installing collected packages: wasabi, typish, tokenizers, text-unidecode, tensorboard-plugin-wit, sortedcontainers, sentencepiece, py4j, msgpack, heapdict, distlib, cymem, commonmark, antlr4-python3-runtime, zict, yacs, xxhash, wrapt, typing-extensions, tqdm, toolz, tensorboard-data-server, tblib, spacy-loggers, spacy-legacy, smart-open, regex, pyrsistent, pyDeprecate, pyasn1-modules, platformdirs, Pillow, ordered-set, omegaconf, oauthlib, numpy, murmurhash, multidict, locket, langcodes, importlib-resources, grpcio, future, frozenlist, filelock, fastprogress, defusedxml, charset-normalizer, cachetools, autocfg, asynctest, absl-py, yarl, torch, tifffile, tensorboardX, scipy, rich, responses, requests-oauthlib, PyWavelets, pydantic, preshed, patsy, partd, opencv-python-headless, nptyping, importlib-metadata, google-auth, fastcore, deprecated, catalogue, blis, async-timeout, aiosignal, xgboost, virtualenv, torchvision, torchtext, torchmetrics, statsmodels, srsly, scikit-image, nlpaug, markdown, jsonschema, hyperopt, huggingface-hub, google-auth-oauthlib, gluonts, fastdownload, fairscale, dask, click, aiohttp, accelerate, typer, transformers, timm, tensorboard, sktime, seqeval, ray, qudida, pytorch-metric-learning, pmdarima, nltk, model-index, lightgbm, gluoncv, distributed, confection, catboost, thinc, tbats, pytorch-lightning, pathy, openmim, datasets, autogluon.common, albumentations, spacy, evaluate, autogluon.features, autogluon.core, fastai, autogluon.tabular, autogluon.multimodal, autogluon.vision, autogluon.timeseries, autogluon.text, autogluon
Attempting uninstall: typing-extensions
Found existing installation: typing_extensions 4.0.1
Uninstalling typing_extensions-4.0.1:
Successfully uninstalled typing_extensions-4.0.1
Attempting uninstall: tqdm
Found existing installation: tqdm 4.39.0
Uninstalling tqdm-4.39.0:
Successfully uninstalled tqdm-4.39.0
Attempting uninstall: Pillow
Found existing installation: Pillow 8.4.0
Uninstalling Pillow-8.4.0:
Successfully uninstalled Pillow-8.4.0
Attempting uninstall: numpy
Found existing installation: numpy 1.19.1
Uninstalling numpy-1.19.1:
Successfully uninstalled numpy-1.19.1
Attempting uninstall: scipy
Found existing installation: scipy 1.4.1
Uninstalling scipy-1.4.1:
Successfully uninstalled scipy-1.4.1
Attempting uninstall: importlib-metadata
Found existing installation: importlib-metadata 4.8.2
Uninstalling importlib-metadata-4.8.2:
Successfully uninstalled importlib-metadata-4.8.2
Attempting uninstall: gluoncv
Found existing installation: gluoncv 0.8.0
Uninstalling gluoncv-0.8.0:
Successfully uninstalled gluoncv-0.8.0
Successfully installed Pillow-9.3.0 PyWavelets-1.3.0 absl-py-1.3.0 accelerate-0.13.2 aiohttp-3.8.3 aiosignal-1.3.1 albumentations-1.1.0 antlr4-python3-runtime-4.8 async-timeout-4.0.2 asynctest-0.13.0 autocfg-0.0.8 autogluon-0.6.1 autogluon.common-0.6.1 autogluon.core-0.6.1 autogluon.features-0.6.1 autogluon.multimodal-0.6.1 autogluon.tabular-0.6.1 autogluon.text-0.6.1 autogluon.timeseries-0.6.1 autogluon.vision-0.6.1 blis-0.7.9 cachetools-5.2.0 catalogue-2.0.8 catboost-1.1.1 charset-normalizer-2.1.1 click-8.0.4 commonmark-0.9.1 confection-0.0.3 cymem-2.0.7 dask-2021.11.2 datasets-2.8.0 defusedxml-0.7.1 deprecated-1.2.13 distlib-0.3.6 distributed-2021.11.2 evaluate-0.3.0 fairscale-0.4.6 fastai-2.7.10 fastcore-1.5.27 fastdownload-0.0.7 fastprogress-1.0.3 filelock-3.8.2 frozenlist-1.3.3 future-0.18.2 gluoncv-0.10.5.post0 gluonts-0.11.6 google-auth-2.15.0 google-auth-oauthlib-0.4.6 grpcio-1.43.0 heapdict-1.0.1 huggingface-hub-0.11.1 hyperopt-0.2.7 importlib-metadata-5.2.0 importlib-resources-5.10.1 jsonschema-4.8.0 langcodes-3.3.0 lightgbm-3.3.3 locket-1.0.0 markdown-3.4.1 model-index-0.1.11 msgpack-1.0.4 multidict-6.0.4 murmurhash-1.0.9 nlpaug-1.1.10 nltk-3.8 nptyping-1.4.4 numpy-1.21.6 oauthlib-3.2.2 omegaconf-2.1.2 opencv-python-headless-4.6.0.66 openmim-0.2.1 ordered-set-4.1.0 partd-1.3.0 pathy-0.10.1 patsy-0.5.3 platformdirs-2.6.0 pmdarima-1.8.5 preshed-3.0.8 py4j-0.10.9.7 pyDeprecate-0.3.2 pyasn1-modules-0.2.8 pydantic-1.10.2 pyrsistent-0.19.2 pytorch-lightning-1.7.7 pytorch-metric-learning-1.3.2 qudida-0.0.4 ray-2.0.1 regex-2022.10.31 requests-oauthlib-1.3.1 responses-0.18.0 rich-12.6.0 scikit-image-0.19.3 scipy-1.7.3 sentencepiece-0.1.97 seqeval-1.2.2 sktime-0.13.4 smart-open-5.2.1 sortedcontainers-2.4.0 spacy-3.4.4 spacy-legacy-3.0.10 spacy-loggers-1.0.4 srsly-2.4.5 statsmodels-0.13.5 tbats-1.1.2 tblib-1.7.0 tensorboard-2.11.0 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 tensorboardX-2.5.1 text-unidecode-1.3 thinc-8.1.6 tifffile-2021.11.2 timm-0.6.12 tokenizers-0.13.2 toolz-0.12.0 torch-1.12.1 torchmetrics-0.8.2 torchtext-0.13.1 torchvision-0.13.1 tqdm-4.64.1 transformers-4.23.1 typer-0.7.0 typing-extensions-4.1.1 typish-1.9.3 virtualenv-20.17.1 wasabi-0.10.1 wrapt-1.14.1 xgboost-1.6.2 xxhash-3.1.0 yacs-0.1.8 yarl-1.8.2 zict-2.2.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
Collecting kaggle
Using cached kaggle-1.5.12-py3-none-any.whl
Collecting python-slugify
Using cached python_slugify-7.0.0-py2.py3-none-any.whl (9.4 kB)
Requirement already satisfied: tqdm in /usr/local/lib/python3.7/site-packages (from kaggle) (4.64.1)
Requirement already satisfied: certifi in /usr/local/lib/python3.7/site-packages (from kaggle) (2021.10.8)
Requirement already satisfied: python-dateutil in /usr/local/lib/python3.7/site-packages (from kaggle) (2.8.2)
Requirement already satisfied: requests in /usr/local/lib/python3.7/site-packages (from kaggle) (2.22.0)
Requirement already satisfied: six>=1.10 in /usr/local/lib/python3.7/site-packages (from kaggle) (1.16.0)
Requirement already satisfied: urllib3 in /usr/local/lib/python3.7/site-packages (from kaggle) (1.25.11)
Requirement already satisfied: text-unidecode>=1.3 in /usr/local/lib/python3.7/site-packages (from python-slugify->kaggle) (1.3)
Requirement already satisfied: idna<2.9,>=2.5 in /usr/local/lib/python3.7/site-packages (from requests->kaggle) (2.8)
Requirement already satisfied: chardet<3.1.0,>=3.0.2 in /usr/local/lib/python3.7/site-packages (from requests->kaggle) (3.0.4)
Installing collected packages: python-slugify, kaggle
Successfully installed kaggle-1.5.12 python-slugify-7.0.0
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv
# create the .kaggle directory and an empty kaggle.json file
!mkdir -p /root/.kaggle
!touch /root/.kaggle/kaggle.json
!chmod 600 /root/.kaggle/kaggle.json
# Fill in your username and key from creating the Kaggle account and API token file
import json
kaggle_username = "nayayyc"
kaggle_key = "<your-kaggle-api-key>"  # redacted here; never publish a real API key
# Save the API token to the kaggle.json file
with open("/root/.kaggle/kaggle.json", "w") as f:
    f.write(json.dumps({"username": kaggle_username, "key": kaggle_key}))
# Download the dataset, it will be in a .zip file so you'll need to unzip it as well.
!kaggle competitions download -c bike-sharing-demand
# If you already downloaded it, the -o flag tells unzip to overwrite the existing files
!unzip -o bike-sharing-demand.zip
Downloading bike-sharing-demand.zip to /root/aws_mle_nanodegree/project_1
100%|████████████████████████████████████████| 189k/189k [00:00<00:00, 7.05MB/s]
Archive:  bike-sharing-demand.zip
  inflating: sampleSubmission.csv
  inflating: test.csv
  inflating: train.csv
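Before reading the CSVs, it can be worth confirming that the unzip actually produced them; a minimal check (the `missing_files` helper is my own, not part of the template) might look like:

```python
from pathlib import Path

def missing_files(expected, directory="."):
    """Return the names from `expected` that are not present in `directory`."""
    present = {p.name for p in Path(directory).iterdir()}
    return [name for name in expected if name not in present]

# An empty list means the unzip above succeeded:
# missing_files(["train.csv", "test.csv", "sampleSubmission.csv"])
```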
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from autogluon.tabular import TabularPredictor
/usr/local/lib/python3.7/site-packages/tqdm/auto.py:22: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html
  from .autonotebook import tqdm as notebook_tqdm
# Create the train dataset in pandas by reading the csv
# Parse the datetime column so you can use some of the `dt` features in pandas later
train = pd.read_csv("train.csv", parse_dates=["datetime"])
train.head()
| | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 3 | 13 | 16 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 8 | 32 | 40 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 5 | 27 | 32 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 3 | 10 | 13 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 0 | 1 | 1 |
train.columns
Index(['datetime', 'season', 'holiday', 'workingday', 'weather', 'temp',
'atemp', 'humidity', 'windspeed', 'casual', 'registered', 'count'],
dtype='object')
# Simple output of the train dataset to view some of the min/max/variation of the dataset features.
train.describe()
| | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | casual | registered | count |
|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.00000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 | 10886.000000 |
| mean | 2.506614 | 0.028569 | 0.680875 | 1.418427 | 20.23086 | 23.655084 | 61.886460 | 12.799395 | 36.021955 | 155.552177 | 191.574132 |
| std | 1.116174 | 0.166599 | 0.466159 | 0.633839 | 7.79159 | 8.474601 | 19.245033 | 8.164537 | 49.960477 | 151.039033 | 181.144454 |
| min | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.82000 | 0.760000 | 0.000000 | 0.000000 | 0.000000 | 0.000000 | 1.000000 |
| 25% | 2.000000 | 0.000000 | 0.000000 | 1.000000 | 13.94000 | 16.665000 | 47.000000 | 7.001500 | 4.000000 | 36.000000 | 42.000000 |
| 50% | 3.000000 | 0.000000 | 1.000000 | 1.000000 | 20.50000 | 24.240000 | 62.000000 | 12.998000 | 17.000000 | 118.000000 | 145.000000 |
| 75% | 4.000000 | 0.000000 | 1.000000 | 2.000000 | 26.24000 | 31.060000 | 77.000000 | 16.997900 | 49.000000 | 222.000000 | 284.000000 |
| max | 4.000000 | 1.000000 | 1.000000 | 4.000000 | 41.00000 | 45.455000 | 100.000000 | 56.996900 | 367.000000 | 886.000000 | 977.000000 |
train.info()
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 10886 entries, 0 to 10885
Data columns (total 12 columns):
 #   Column      Non-Null Count  Dtype
---  ------      --------------  -----
 0   datetime    10886 non-null  object
 1   season      10886 non-null  int64
 2   holiday     10886 non-null  int64
 3   workingday  10886 non-null  int64
 4   weather     10886 non-null  int64
 5   temp        10886 non-null  float64
 6   atemp       10886 non-null  float64
 7   humidity    10886 non-null  int64
 8   windspeed   10886 non-null  float64
 9   casual      10886 non-null  int64
 10  registered  10886 non-null  int64
 11  count       10886 non-null  int64
dtypes: float64(3), int64(8), object(1)
memory usage: 1020.7+ KB
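Once `datetime` is parsed as a real datetime64 column, the pandas `dt` accessor can derive extra calendar features; a small illustration on synthetic timestamps (the `hour` and `dayofweek` column names are my own choice):

```python
import pandas as pd

# Two synthetic timestamps standing in for the parsed train data
df = pd.DataFrame(
    {"datetime": pd.to_datetime(["2011-01-01 00:00:00", "2011-01-01 13:00:00"])}
)

# These accessors only work on datetime64 columns, not plain object strings
df["hour"] = df["datetime"].dt.hour            # 0 and 13
df["dayofweek"] = df["datetime"].dt.dayofweek  # 2011-01-01 is a Saturday -> 5
```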
# Create the test pandas dataframe in pandas by reading the csv, remember to parse the datetime!
test = pd.read_csv("test.csv", parse_dates=["datetime"])
test.head()
| | datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-20 00:00:00 | 1 | 0 | 1 | 1 | 10.66 | 11.365 | 56 | 26.0027 |
| 1 | 2011-01-20 01:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 2 | 2011-01-20 02:00:00 | 1 | 0 | 1 | 1 | 10.66 | 13.635 | 56 | 0.0000 |
| 3 | 2011-01-20 03:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
| 4 | 2011-01-20 04:00:00 | 1 | 0 | 1 | 1 | 10.66 | 12.880 | 56 | 11.0014 |
test.columns
Index(['datetime', 'season', 'holiday', 'workingday', 'weather', 'temp',
'atemp', 'humidity', 'windspeed'],
dtype='object')
test.describe()
| | season | holiday | workingday | weather | temp | atemp | humidity | windspeed |
|---|---|---|---|---|---|---|---|---|
| count | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 | 6493.000000 |
| mean | 2.493300 | 0.029108 | 0.685815 | 1.436778 | 20.620607 | 24.012865 | 64.125212 | 12.631157 |
| std | 1.091258 | 0.168123 | 0.464226 | 0.648390 | 8.059583 | 8.782741 | 19.293391 | 8.250151 |
| min | 1.000000 | 0.000000 | 0.000000 | 1.000000 | 0.820000 | 0.000000 | 16.000000 | 0.000000 |
| 25% | 2.000000 | 0.000000 | 0.000000 | 1.000000 | 13.940000 | 16.665000 | 49.000000 | 7.001500 |
| 50% | 3.000000 | 0.000000 | 1.000000 | 1.000000 | 21.320000 | 25.000000 | 65.000000 | 11.001400 |
| 75% | 3.000000 | 0.000000 | 1.000000 | 2.000000 | 27.060000 | 31.060000 | 81.000000 | 16.997900 |
| max | 4.000000 | 1.000000 | 1.000000 | 4.000000 | 40.180000 | 50.000000 | 100.000000 | 55.998600 |
# Read the sample submission the same way as the train and test datasets
submission = pd.read_csv("sampleSubmission.csv", parse_dates=["datetime"])
submission.head()
| | datetime | count |
|---|---|---|
| 0 | 2011-01-20 00:00:00 | 0 |
| 1 | 2011-01-20 01:00:00 | 0 |
| 2 | 2011-01-20 02:00:00 | 0 |
| 3 | 2011-01-20 03:00:00 | 0 |
| 4 | 2011-01-20 04:00:00 | 0 |
submission.describe()
| | count |
|---|---|
| count | 6493.0 |
| mean | 0.0 |
| std | 0.0 |
| min | 0.0 |
| 25% | 0.0 |
| 50% | 0.0 |
| 75% | 0.0 |
| max | 0.0 |
Requirements:
- We are predicting `count`, so it is the label we are setting.
- Ignore the `casual` and `registered` columns, as they are also not present in the test dataset.
- Use `root_mean_squared_error` as the metric to use for evaluation.
- Use the `best_quality` preset to focus on creating the best model.
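For reference, the chosen metric can be written out by hand; a minimal NumPy sketch (illustrative only — AutoGluon computes this internally and flips its sign so that higher is better):

```python
import numpy as np

def rmse(y_true, y_pred):
    """Unsigned root mean squared error over two equal-length sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

rmse([3, 5, 2], [2, 5, 4])  # sqrt((1 + 0 + 4) / 3) ≈ 1.291
```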
# Drop these two columns instead of passing ignored_columns,
# since they are not present in the test dataset
train.drop(columns=["casual", "registered"], inplace=True)
predictor = TabularPredictor(
    label="count",
    eval_metric="root_mean_squared_error",
).fit(
    train_data=train,
    time_limit=600,
    presets="best_quality",
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221226_115948/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221226_115948/"
AutoGluon Version: 0.6.1
Python Version: 3.7.10
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Oct 26 20:36:53 UTC 2022
Train Data Rows: 10886
Train Data Columns: 9
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 6923.01 MB
Train Data (Original) Memory Usage: 1.52 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting DatetimeFeatureGenerator...
/usr/local/lib/python3.7/site-packages/autogluon/features/generators/datetime.py:59: FutureWarning: casting datetime64[ns, UTC] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
good_rows = series[~series.isin(bad_rows)].astype(np.int64)
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 5 | ['season', 'holiday', 'workingday', 'weather', 'humidity']
('object', ['datetime_as_object']) : 1 | ['datetime']
Types of features in processed data (raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.3s = Fit runtime
9 features in original data used to generate 13 features in processed data.
Train Data (Processed) Memory Usage: 0.98 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.38s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.64s of the 599.61s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.03s = Training runtime
0.1s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 396.45s of the 596.42s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.03s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 396.08s of the 596.05s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-131.4609 = Validation score (-root_mean_squared_error)
65.35s = Training runtime
6.55s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 319.99s of the 519.96s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-131.0542 = Validation score (-root_mean_squared_error)
30.42s = Training runtime
1.47s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 284.8s of the 484.77s of remaining time.
-116.5443 = Validation score (-root_mean_squared_error)
10.91s = Training runtime
0.57s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 270.57s of the 470.53s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-130.5332 = Validation score (-root_mean_squared_error)
201.81s = Training runtime
0.18s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 64.79s of the 264.76s of remaining time.
-124.5881 = Validation score (-root_mean_squared_error)
5.16s = Training runtime
0.55s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 56.2s of the 256.16s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-138.3722 = Validation score (-root_mean_squared_error)
71.59s = Training runtime
0.41s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 179.16s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.53s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 178.55s of the 178.52s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-60.3946 = Validation score (-root_mean_squared_error)
55.85s = Training runtime
3.75s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 117.33s of the 117.3s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-55.2179 = Validation score (-root_mean_squared_error)
25.77s = Training runtime
0.22s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 86.96s of the 86.94s of remaining time.
-53.4065 = Validation score (-root_mean_squared_error)
26.52s = Training runtime
0.64s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 57.26s of the 57.23s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-55.7444 = Validation score (-root_mean_squared_error)
59.33s = Training runtime
0.06s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -6.3s of remaining time.
-53.1096 = Validation score (-root_mean_squared_error)
0.39s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 606.92s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_115948/")
predictor.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -53.109600 14.614315 553.148430 0.001307 0.387465 3 True 14
1 RandomForestMSE_BAG_L2 -53.406479 10.585600 411.811534 0.637849 26.515250 2 True 12
2 LightGBM_BAG_L2 -55.217867 10.165047 411.070628 0.217296 25.774344 2 True 11
3 CatBoost_BAG_L2 -55.744445 10.012650 444.625488 0.064899 59.329205 2 True 13
4 LightGBMXT_BAG_L2 -60.394630 13.692964 441.142166 3.745213 55.845882 2 True 10
5 KNeighborsDist_BAG_L1 -84.125061 0.103698 0.029283 0.103698 0.029283 1 True 2
6 WeightedEnsemble_L2 -84.125061 0.104856 0.556873 0.001159 0.527590 2 True 9
7 KNeighborsUnif_BAG_L1 -101.546199 0.103664 0.031033 0.103664 0.031033 1 True 1
8 RandomForestMSE_BAG_L1 -116.544294 0.569369 10.905250 0.569369 10.905250 1 True 5
9 ExtraTreesMSE_BAG_L1 -124.588053 0.554608 5.158636 0.554608 5.158636 1 True 7
10 CatBoost_BAG_L1 -130.533194 0.177396 201.808275 0.177396 201.808275 1 True 6
11 LightGBM_BAG_L1 -131.054162 1.472807 30.420964 1.472807 30.420964 1 True 4
12 LightGBMXT_BAG_L1 -131.460909 6.551322 65.351942 6.551322 65.351942 1 True 3
13 NeuralNetFastAI_BAG_L1 -138.372209 0.414887 71.590901 0.414887 71.590901 1 True 8
Number of models trained: 14
Types of models trained:
{'StackerEnsembleModel_NNFastAiTabular', 'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 3 | ['season', 'weather', 'humidity']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_115948/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -131.46090891834504,
'LightGBM_BAG_L1': -131.054161598899,
'RandomForestMSE_BAG_L1': -116.54429428704391,
'CatBoost_BAG_L1': -130.5331939673838,
'ExtraTreesMSE_BAG_L1': -124.58805258915959,
'NeuralNetFastAI_BAG_L1': -138.37220877327402,
'WeightedEnsemble_L2': -84.12506123181602,
'LightGBMXT_BAG_L2': -60.394630458831784,
'LightGBM_BAG_L2': -55.21786685203879,
'RandomForestMSE_BAG_L2': -53.40647918962767,
'CatBoost_BAG_L2': -55.74444485320961,
'WeightedEnsemble_L3': -53.109600407057876},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20221226_115948/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_115948/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221226_115948/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221226_115948/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221226_115948/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221226_115948/models/CatBoost_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_115948/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.031032800674438477,
'KNeighborsDist_BAG_L1': 0.02928328514099121,
'LightGBMXT_BAG_L1': 65.35194182395935,
'LightGBM_BAG_L1': 30.420964002609253,
'RandomForestMSE_BAG_L1': 10.90524959564209,
'CatBoost_BAG_L1': 201.8082754611969,
'ExtraTreesMSE_BAG_L1': 5.15863561630249,
'NeuralNetFastAI_BAG_L1': 71.59090113639832,
'WeightedEnsemble_L2': 0.5275900363922119,
'LightGBMXT_BAG_L2': 55.845882415771484,
'LightGBM_BAG_L2': 25.774344205856323,
'RandomForestMSE_BAG_L2': 26.515249967575073,
'CatBoost_BAG_L2': 59.32920455932617,
'WeightedEnsemble_L3': 0.3874647617340088},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10366415977478027,
'KNeighborsDist_BAG_L1': 0.1036977767944336,
'LightGBMXT_BAG_L1': 6.551321506500244,
'LightGBM_BAG_L1': 1.4728071689605713,
'RandomForestMSE_BAG_L1': 0.5693690776824951,
'CatBoost_BAG_L1': 0.17739629745483398,
'ExtraTreesMSE_BAG_L1': 0.5546078681945801,
'NeuralNetFastAI_BAG_L1': 0.4148869514465332,
'WeightedEnsemble_L2': 0.0011587142944335938,
'LightGBMXT_BAG_L2': 3.7452127933502197,
'LightGBM_BAG_L2': 0.21729588508605957,
'RandomForestMSE_BAG_L2': 0.6378493309020996,
'CatBoost_BAG_L2': 0.06489896774291992,
'WeightedEnsemble_L3': 0.001306772232055664},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -53.109600 14.614315 553.148430
1 RandomForestMSE_BAG_L2 -53.406479 10.585600 411.811534
2 LightGBM_BAG_L2 -55.217867 10.165047 411.070628
3 CatBoost_BAG_L2 -55.744445 10.012650 444.625488
4 LightGBMXT_BAG_L2 -60.394630 13.692964 441.142166
5 KNeighborsDist_BAG_L1 -84.125061 0.103698 0.029283
6 WeightedEnsemble_L2 -84.125061 0.104856 0.556873
7 KNeighborsUnif_BAG_L1 -101.546199 0.103664 0.031033
8 RandomForestMSE_BAG_L1 -116.544294 0.569369 10.905250
9 ExtraTreesMSE_BAG_L1 -124.588053 0.554608 5.158636
10 CatBoost_BAG_L1 -130.533194 0.177396 201.808275
11 LightGBM_BAG_L1 -131.054162 1.472807 30.420964
12 LightGBMXT_BAG_L1 -131.460909 6.551322 65.351942
13 NeuralNetFastAI_BAG_L1 -138.372209 0.414887 71.590901
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001307 0.387465 3 True
1 0.637849 26.515250 2 True
2 0.217296 25.774344 2 True
3 0.064899 59.329205 2 True
4 3.745213 55.845882 2 True
5 0.103698 0.029283 1 True
6 0.001159 0.527590 2 True
7 0.103664 0.031033 1 True
8 0.569369 10.905250 1 True
9 0.554608 5.158636 1 True
10 0.177396 201.808275 1 True
11 1.472807 30.420964 1 True
12 6.551322 65.351942 1 True
13 0.414887 71.590901 1 True
fit_order
0 14
1 12
2 11
3 13
4 10
5 2
6 9
7 1
8 5
9 7
10 6
11 4
12 3
13 8 }
predictor.leaderboard(silent=True)
| model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -53.109600 | 14.614315 | 553.148430 | 0.001307 | 0.387465 | 3 | True | 14 |
| 1 | RandomForestMSE_BAG_L2 | -53.406479 | 10.585600 | 411.811534 | 0.637849 | 26.515250 | 2 | True | 12 |
| 2 | LightGBM_BAG_L2 | -55.217867 | 10.165047 | 411.070628 | 0.217296 | 25.774344 | 2 | True | 11 |
| 3 | CatBoost_BAG_L2 | -55.744445 | 10.012650 | 444.625488 | 0.064899 | 59.329205 | 2 | True | 13 |
| 4 | LightGBMXT_BAG_L2 | -60.394630 | 13.692964 | 441.142166 | 3.745213 | 55.845882 | 2 | True | 10 |
| 5 | KNeighborsDist_BAG_L1 | -84.125061 | 0.103698 | 0.029283 | 0.103698 | 0.029283 | 1 | True | 2 |
| 6 | WeightedEnsemble_L2 | -84.125061 | 0.104856 | 0.556873 | 0.001159 | 0.527590 | 2 | True | 9 |
| 7 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.103664 | 0.031033 | 0.103664 | 0.031033 | 1 | True | 1 |
| 8 | RandomForestMSE_BAG_L1 | -116.544294 | 0.569369 | 10.905250 | 0.569369 | 10.905250 | 1 | True | 5 |
| 9 | ExtraTreesMSE_BAG_L1 | -124.588053 | 0.554608 | 5.158636 | 0.554608 | 5.158636 | 1 | True | 7 |
| 10 | CatBoost_BAG_L1 | -130.533194 | 0.177396 | 201.808275 | 0.177396 | 201.808275 | 1 | True | 6 |
| 11 | LightGBM_BAG_L1 | -131.054162 | 1.472807 | 30.420964 | 1.472807 | 30.420964 | 1 | True | 4 |
| 12 | LightGBMXT_BAG_L1 | -131.460909 | 6.551322 | 65.351942 | 6.551322 | 65.351942 | 1 | True | 3 |
| 13 | NeuralNetFastAI_BAG_L1 | -138.372209 | 0.414887 | 71.590901 | 0.414887 | 71.590901 | 1 | True | 8 |
fig = predictor.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_1_leaderboard.png')
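One pitfall with the savefig call above: matplotlib will not create the img/ directory for you. A minimal sketch (toy leaderboard data and the Agg backend assumed, not the real predictor output) that creates the folder first:

```python
import os

import matplotlib
matplotlib.use("Agg")  # headless backend so savefig works without a display
import pandas as pd

# Toy stand-in for predictor.leaderboard(silent=True)
lb = pd.DataFrame({
    "model": ["WeightedEnsemble_L3", "LightGBM_BAG_L1"],
    "score_val": [-53.11, -131.05],
})

os.makedirs("img", exist_ok=True)  # savefig raises if the folder is missing
fig = lb.plot(kind="bar", x="model", y="score_val").figure
fig.savefig("img/exp_1_leaderboard.png")
```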
predictions = predictor.predict(test)
predictions.head()
0    23.355633
1    41.986988
2    45.374504
3    49.315769
4    52.064514
Name: count, dtype: float32
# Describe the `predictions` series to see if there are any negative values
predictions.describe()
count    6493.000000
mean      100.940247
std        89.856956
min         3.016292
25%        20.067400
50%        64.116325
75%       167.614639
max       365.451843
Name: count, dtype: float64
# How many negative values do we have?
(predictions<0).sum()
0
# All values are non-negative, so no clipping is needed before submission
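Kaggle scores this competition with RMSLE, which is undefined for negative predictions, so a defensive clip is cheap insurance even when the check above returns 0. A sketch on toy values:

```python
import pandas as pd

# Toy predictions with one negative value for illustration
preds = pd.Series([23.4, -1.2, 45.3], name="count")

# Series.clip is a vectorized way to floor predictions at zero
safe = preds.clip(lower=0)
```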
submission["count"] = predictions
submission.to_csv("submission.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission.csv -m "initial submission 1"
100%|█████████████████████████████████████████| 188k/188k [00:00<00:00, 316kB/s]
Successfully submitted to Bike Sharing Demand
My Submissions
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName date description status publicScore privateScore
------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ----------- ------------
submission.csv 2022-12-26 12:10:34 initial submission 1 complete 1.79067 1.79067
submission_new_hpo_3f.csv 2022-12-25 20:15:51 hp tuning 3f complete 0.50219 0.50219
submission_new_hpo_3e.csv 2022-12-25 19:59:07 hp tuning 3e complete 0.53176 0.53176
submission_new_hpo_3c.csv 2022-12-25 19:35:06 hpo 3c num_bag_sets = 5 complete 0.63215 0.63215
# Exploratory data analysis: plot a histogram of every feature to show its distribution
train.hist(figsize = (20,10))
plt.show()
## Visualize pairwise correlations with a heatmap
plt.figure(figsize=(10,8))
sns.heatmap(train.corr())
plt.show()
## Let's view pairwise scatterplots with weather facet
sns.pairplot(train, hue="weather")
plt.show()
Note — weather codes:
1: Clear, Few clouds, Partly cloudy
2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
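For plotting, the four codes can be mapped to short labels; the label wording below is my own shorthand, not from the dataset:

```python
import pandas as pd

# Shorthand labels for the four weather codes described above
weather_labels = {
    1: "Clear/Partly cloudy",
    2: "Mist/Cloudy",
    3: "Light rain/snow",
    4: "Heavy rain/snow",
}

weather = pd.Series([1, 2, 3, 1])
named = weather.map(weather_labels)
```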
plt.figure(figsize=(10,5))
plt.plot(train['datetime'],train['count'].ewm(span = 24).mean())
plt.title('Bike Sharing Demand over 2011-2012.')
plt.xlabel('Timeframe')
plt.ylabel('Hourly bike count')
start_number=0
end_number = len(train['datetime'])
step_number = 24*30
plt.xticks(range(start_number,end_number,step_number),rotation=90)
plt.show()
## Observation: there is a general growth trend (2011 vs 2012)
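The visual growth trend can be quantified with a group-by on year; a sketch on a tiny synthetic frame (the real call would group the full train frame the same way):

```python
import pandas as pd

# Tiny stand-in for train: two hourly observations per year
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2011-06-01 08:00", "2011-06-01 09:00",
                                "2012-06-01 08:00", "2012-06-01 09:00"]),
    "count": [100, 120, 180, 200],
})

# Mean hourly demand per year makes the 2011-vs-2012 growth explicit
yearly = df.groupby(df["datetime"].dt.year)["count"].mean()
```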
## Let's look at week timeframe
filt = (train['datetime']>='2011-01-03') & (train['datetime']<='2011-01-10')
plt.figure(figsize=(8,4))
plt.plot(train[filt]['datetime'],train[filt]['count'])
plt.title('Bike Sharing Demand from Jan 3 to Jan 10, 2011.')
plt.xlabel('Timeframe')
plt.ylabel('Hourly bike count')
start_number=0
end_number = len(train[filt]['datetime'])
step_number = 10
plt.grid(alpha=0.3)
plt.xticks(range(start_number,end_number,step_number),rotation=90)
plt.show()
## Observations: strong hourly seasonality; lower demand on weekends
## Let's look at day timeframe
plt.figure(figsize=(8,4))
filt = (train['datetime']>='2011-01-04') & (train['datetime']<'2011-01-05')
plt.plot(train[filt]['datetime'],train[filt]['count'])
plt.title('Bike Sharing Demand for Jan 4 2011.')
plt.xlabel('Timeframe')
plt.ylabel('Hourly bike count')
plt.grid(alpha=0.3)
plt.xticks(rotation=90)
plt.show()
## Let's look at day timeframe
plt.figure(figsize=(8,4))
filt = (train['datetime']>='2012-01-05') & (train['datetime']<'2012-01-06')
plt.plot(train[filt]['datetime'],train[filt]['count'])
plt.title('Bike Sharing Demand for Jan 5 2012.')
plt.xlabel('Timeframe')
plt.ylabel('Hourly bike count')
plt.grid(alpha=0.3)
plt.xticks(rotation=90)
plt.show()
## Observation: there are three spikes in demand: morning (7-9am), lunch (11am-1pm), and evening (4-7pm). Demand falls to its lowest levels between 11pm and 6am.
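The spike hours can be picked out programmatically by flagging hours whose mean demand exceeds the 75th percentile; a sketch on a synthetic hourly profile with commute peaks:

```python
import pandas as pd

# Synthetic 24-hour profile with commute spikes at hours 7-9 and 17-19
hours = list(range(24))
counts = [20] * 7 + [300, 320, 250] + [80] * 6 + [60, 310, 330, 280] + [50] * 4
df = pd.DataFrame({"hour": hours, "count": counts})

hourly = df.groupby("hour")["count"].mean()
# Hours above the 75th percentile flag the demand peaks
peaks = hourly[hourly > hourly.quantile(0.75)].index.tolist()
```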
# Create new calendar features from the datetime column
train['datetime'] = pd.to_datetime(train['datetime'])
train['datetime_hour'] = train['datetime'].dt.hour
train['datetime_day'] = train['datetime'].dt.day
train['datetime_week'] = train['datetime'].dt.isocalendar().week.astype(int)
train['datetime_month'] = train['datetime'].dt.month
train['datetime_year'] = train['datetime'].dt.year
train['datetime_dayofweek'] = train['datetime'].dt.dayofweek
test['datetime'] = pd.to_datetime(test['datetime'])
test['datetime_hour'] = test['datetime'].dt.hour
test['datetime_day'] = test['datetime'].dt.day
test['datetime_week'] = test['datetime'].dt.isocalendar().week.astype(int)
test['datetime_month'] = test['datetime'].dt.month
test['datetime_year'] = test['datetime'].dt.year
test['datetime_dayofweek'] = test['datetime'].dt.dayofweek
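The same six derived columns are written twice, once for train and once for test; a small helper (hypothetical name `add_datetime_features`) keeps the two frames from drifting apart and uses the non-deprecated `isocalendar().week`:

```python
import pandas as pd

def add_datetime_features(df):
    """Derive calendar features from a 'datetime' column, in place."""
    dt = pd.to_datetime(df["datetime"])
    df["datetime_hour"] = dt.dt.hour
    df["datetime_day"] = dt.dt.day
    # isocalendar().week replaces the deprecated Series.dt.week
    df["datetime_week"] = dt.dt.isocalendar().week.astype(int)
    df["datetime_month"] = dt.dt.month
    df["datetime_year"] = dt.dt.year
    df["datetime_dayofweek"] = dt.dt.dayofweek
    return df

# Usage on a one-row toy frame
sample = add_datetime_features(pd.DataFrame({"datetime": ["2011-01-01 05:00:00"]}))
```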
## Add an hour-category feature: morning, lunch, evening, night, or other
def extract_hour_category(h):
    if h in [7, 8, 9]:                 # morning commute
        return 1
    elif h in [11, 12, 13]:            # lunch
        return 2
    elif h in [17, 18, 19]:            # evening commute
        return 3
    elif h in [23, 0, 1, 2, 3, 4, 5]:  # night
        return 4
    else:
        return 0                       # other
train['hour_category'] = train['datetime_hour'].apply(extract_hour_category)
test['hour_category'] = test['datetime_hour'].apply(extract_hour_category)
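The if/elif chain above is easy to read; for larger frames, a vectorized equivalent with `numpy.select` (same category codes) is an option:

```python
import numpy as np
import pandas as pd

hours = pd.Series([8, 12, 18, 2, 15])

conditions = [
    hours.isin([7, 8, 9]),               # morning commute -> 1
    hours.isin([11, 12, 13]),            # lunch           -> 2
    hours.isin([17, 18, 19]),            # evening commute -> 3
    hours.isin([23, 0, 1, 2, 3, 4, 5]),  # night           -> 4
]
hour_category = pd.Series(np.select(conditions, [1, 2, 3, 4], default=0))
```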
train["season"] = train["season"].astype("category")
train["weather"] = train["weather"].astype("category")
test["season"] = test["season"].astype("category")
test["weather"] = test["weather"].astype("category")
train['hour_category'] = train["hour_category"].astype("category")
test['hour_category'] = test["hour_category"].astype("category")
# View our new features
train.head()
| datetime | season | holiday | workingday | weather | temp | atemp | humidity | windspeed | count | datetime_hour | datetime_day | datetime_month | datetime_year | datetime_dayofweek | datetime_week | hour_category | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2011-01-01 00:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 81 | 0.0 | 16 | 0 | 1 | 1 | 2011 | 5 | 52 | 4 |
| 1 | 2011-01-01 01:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 40 | 1 | 1 | 1 | 2011 | 5 | 52 | 4 |
| 2 | 2011-01-01 02:00:00 | 1 | 0 | 0 | 1 | 9.02 | 13.635 | 80 | 0.0 | 32 | 2 | 1 | 1 | 2011 | 5 | 52 | 4 |
| 3 | 2011-01-01 03:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 13 | 3 | 1 | 1 | 2011 | 5 | 52 | 4 |
| 4 | 2011-01-01 04:00:00 | 1 | 0 | 0 | 1 | 9.84 | 14.395 | 75 | 0.0 | 1 | 4 | 1 | 1 | 2011 | 5 | 52 | 4 |
# View histogram of all features again now with the hour feature
import matplotlib.pyplot as plt
train.hist(figsize = (20,12))
plt.show()
sns.pairplot(train, hue="hour_category")
plt.show()
filt = (train['datetime_year']==2012) #& (train['datetime_month']==4)
plt.bar(train[filt]['datetime_hour'],train[filt]['count'])
plt.axhline(train[filt]['count'].quantile(0.75),c='black',linestyle='--')
plt.axvline(7, c='r', linestyle='--')
plt.axvline(9, c='r', linestyle='--')
plt.axvline(11, c='r', linestyle='--')
plt.axvline(13, c='r', linestyle='--')
plt.axvline(16, c='r', linestyle='--')
plt.axvline(19, c='r', linestyle='--')
plt.show()
train['count'].describe()
count    10886.000000
mean       191.574132
std        181.144454
min          1.000000
25%         42.000000
50%        145.000000
75%        284.000000
max        977.000000
Name: count, dtype: float64
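The summary shows count is heavily right-skewed (mean 191 vs max 977), and Kaggle scores with RMSLE, so training on log1p(count) and inverting with expm1 is a common stand-out experiment. A sketch of the round trip (not used in the runs below):

```python
import numpy as np
import pandas as pd

counts = pd.Series([1, 42, 145, 284, 977])  # quartiles/extremes from describe()
log_counts = np.log1p(counts)     # compresses the long right tail
recovered = np.expm1(log_counts)  # exact inverse, applied before submission
```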
predictor_new_features_2a = TabularPredictor(
label = 'count',
eval_metric = 'root_mean_squared_error',
).fit(
train_data = train,
time_limit = 600,
presets='best_quality'
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221226_142547/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221226_142547/"
AutoGluon Version: 0.6.1
Python Version: 3.7.10
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Oct 26 20:36:53 UTC 2022
Train Data Rows: 10886
Train Data Columns: 16
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 5327.31 MB
Train Data (Original) Memory Usage: 1.17 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
/usr/local/lib/python3.7/site-packages/autogluon/features/generators/datetime.py:59: FutureWarning: casting datetime64[ns, UTC] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
good_rows = series[~series.isin(bad_rows)].astype(np.int64)
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 9 | ['holiday', 'workingday', 'humidity', 'datetime_hour', 'datetime_day', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.3s = Fit runtime
16 features in original data used to generate 20 features in processed data.
Train Data (Processed) Memory Usage: 1.29 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.36s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.66s of the 599.64s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.07s = Training runtime
0.11s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.23s of the 599.21s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.06s = Training runtime
0.11s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.78s of the 598.76s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-32.9724 = Validation score (-root_mean_squared_error)
105.5s = Training runtime
17.44s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 280.63s of the 480.61s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.5406 = Validation score (-root_mean_squared_error)
52.79s = Training runtime
3.46s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 201.19s of the 401.17s of remaining time.
-38.2831 = Validation score (-root_mean_squared_error)
17.86s = Training runtime
0.9s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 179.72s of the 379.7s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-34.2232 = Validation score (-root_mean_squared_error)
171.68s = Training runtime
0.16s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 188.41s of remaining time.
-31.6532 = Validation score (-root_mean_squared_error)
0.67s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 187.64s of the 187.62s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.1273 = Validation score (-root_mean_squared_error)
35.7s = Training runtime
0.7s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 147.59s of the 147.57s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.4888 = Validation score (-root_mean_squared_error)
27.82s = Training runtime
0.31s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 115.19s of the 115.17s of remaining time.
-31.3546 = Validation score (-root_mean_squared_error)
29.86s = Training runtime
0.68s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 82.25s of the 82.23s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.3779 = Validation score (-root_mean_squared_error)
79.54s = Training runtime
0.13s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -1.76s of remaining time.
-30.1069 = Validation score (-root_mean_squared_error)
0.29s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 602.26s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_142547/")
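As the training log notes, AutoGluon flips error metrics so that higher is better; to read a leaderboard `score_val` as plain RMSE, negate it:

```python
# score_val of the best model in this run's leaderboard (WeightedEnsemble_L3)
score_val = -30.106888
rmse = -1 * score_val  # actual root_mean_squared_error
```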
predictor_new_features_2a.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.106888 24.001545 521.170630 0.001105 0.292291 3 True 12
1 CatBoost_BAG_L2 -30.377937 22.302106 427.497371 0.125932 79.537608 2 True 11
2 LightGBM_BAG_L2 -30.488781 22.488722 375.780792 0.312548 27.821029 2 True 9
3 LightGBMXT_BAG_L2 -31.127338 22.881001 383.662378 0.704827 35.702615 2 True 8
4 RandomForestMSE_BAG_L2 -31.354564 22.857134 377.817086 0.680960 29.857323 2 True 10
5 WeightedEnsemble_L2 -31.653175 22.069522 348.557548 0.001258 0.665796 2 True 7
6 LightGBMXT_BAG_L1 -32.972358 17.436497 105.503806 17.436497 105.503806 1 True 3
7 LightGBM_BAG_L1 -33.540630 3.461847 52.791747 3.461847 52.791747 1 True 4
8 CatBoost_BAG_L1 -34.223240 0.159176 171.680278 0.159176 171.680278 1 True 6
9 RandomForestMSE_BAG_L1 -38.283140 0.903128 17.857386 0.903128 17.857386 1 True 5
10 KNeighborsDist_BAG_L1 -84.125061 0.107616 0.058535 0.107616 0.058535 1 True 2
11 KNeighborsUnif_BAG_L1 -101.546199 0.107910 0.068011 0.107910 0.068011 1 True 1
Number of models trained: 12
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_142547/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -32.972357766618615,
'LightGBM_BAG_L1': -33.54062969122618,
'RandomForestMSE_BAG_L1': -38.28313968009453,
'CatBoost_BAG_L1': -34.2232398416045,
'WeightedEnsemble_L2': -31.653175350057957,
'LightGBMXT_BAG_L2': -31.12733811020385,
'LightGBM_BAG_L2': -30.488780530107636,
'RandomForestMSE_BAG_L2': -31.354563539567376,
'CatBoost_BAG_L2': -30.37793710395594,
'WeightedEnsemble_L3': -30.10688847519134},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221226_142547/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221226_142547/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221226_142547/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221226_142547/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221226_142547/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221226_142547/models/CatBoost_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_142547/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221226_142547/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221226_142547/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221226_142547/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221226_142547/models/CatBoost_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_142547/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.06801056861877441,
'KNeighborsDist_BAG_L1': 0.058534860610961914,
'LightGBMXT_BAG_L1': 105.5038058757782,
'LightGBM_BAG_L1': 52.791746854782104,
'RandomForestMSE_BAG_L1': 17.857386350631714,
'CatBoost_BAG_L1': 171.680278301239,
'WeightedEnsemble_L2': 0.6657960414886475,
'LightGBMXT_BAG_L2': 35.7026150226593,
'LightGBM_BAG_L2': 27.82102870941162,
'RandomForestMSE_BAG_L2': 29.85732340812683,
'CatBoost_BAG_L2': 79.53760838508606,
'WeightedEnsemble_L3': 0.29229116439819336},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.1079099178314209,
'KNeighborsDist_BAG_L1': 0.10761642456054688,
'LightGBMXT_BAG_L1': 17.43649673461914,
'LightGBM_BAG_L1': 3.4618468284606934,
'RandomForestMSE_BAG_L1': 0.903127908706665,
'CatBoost_BAG_L1': 0.15917611122131348,
'WeightedEnsemble_L2': 0.0012576580047607422,
'LightGBMXT_BAG_L2': 0.704827070236206,
'LightGBM_BAG_L2': 0.3125481605529785,
'RandomForestMSE_BAG_L2': 0.6809597015380859,
'CatBoost_BAG_L2': 0.1259317398071289,
'WeightedEnsemble_L3': 0.0011048316955566406},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.106888 24.001545 521.170630
1 CatBoost_BAG_L2 -30.377937 22.302106 427.497371
2 LightGBM_BAG_L2 -30.488781 22.488722 375.780792
3 LightGBMXT_BAG_L2 -31.127338 22.881001 383.662378
4 RandomForestMSE_BAG_L2 -31.354564 22.857134 377.817086
5 WeightedEnsemble_L2 -31.653175 22.069522 348.557548
6 LightGBMXT_BAG_L1 -32.972358 17.436497 105.503806
7 LightGBM_BAG_L1 -33.540630 3.461847 52.791747
8 CatBoost_BAG_L1 -34.223240 0.159176 171.680278
9 RandomForestMSE_BAG_L1 -38.283140 0.903128 17.857386
10 KNeighborsDist_BAG_L1 -84.125061 0.107616 0.058535
11 KNeighborsUnif_BAG_L1 -101.546199 0.107910 0.068011
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001105 0.292291 3 True
1 0.125932 79.537608 2 True
2 0.312548 27.821029 2 True
3 0.704827 35.702615 2 True
4 0.680960 29.857323 2 True
5 0.001258 0.665796 2 True
6 17.436497 105.503806 1 True
7 3.461847 52.791747 1 True
8 0.159176 171.680278 1 True
9 0.903128 17.857386 1 True
10 0.107616 0.058535 1 True
11 0.107910 0.068011 1 True
fit_order
0 12
1 11
2 9
3 8
4 10
5 7
6 3
7 4
8 6
9 5
10 2
11 1 }
predictor_new_features_2a.leaderboard(silent=True)
| model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -30.106888 | 24.001545 | 521.170630 | 0.001105 | 0.292291 | 3 | True | 12 |
| 1 | CatBoost_BAG_L2 | -30.377937 | 22.302106 | 427.497371 | 0.125932 | 79.537608 | 2 | True | 11 |
| 2 | LightGBM_BAG_L2 | -30.488781 | 22.488722 | 375.780792 | 0.312548 | 27.821029 | 2 | True | 9 |
| 3 | LightGBMXT_BAG_L2 | -31.127338 | 22.881001 | 383.662378 | 0.704827 | 35.702615 | 2 | True | 8 |
| 4 | RandomForestMSE_BAG_L2 | -31.354564 | 22.857134 | 377.817086 | 0.680960 | 29.857323 | 2 | True | 10 |
| 5 | WeightedEnsemble_L2 | -31.653175 | 22.069522 | 348.557548 | 0.001258 | 0.665796 | 2 | True | 7 |
| 6 | LightGBMXT_BAG_L1 | -32.972358 | 17.436497 | 105.503806 | 17.436497 | 105.503806 | 1 | True | 3 |
| 7 | LightGBM_BAG_L1 | -33.540630 | 3.461847 | 52.791747 | 3.461847 | 52.791747 | 1 | True | 4 |
| 8 | CatBoost_BAG_L1 | -34.223240 | 0.159176 | 171.680278 | 0.159176 | 171.680278 | 1 | True | 6 |
| 9 | RandomForestMSE_BAG_L1 | -38.283140 | 0.903128 | 17.857386 | 0.903128 | 17.857386 | 1 | True | 5 |
| 10 | KNeighborsDist_BAG_L1 | -84.125061 | 0.107616 | 0.058535 | 0.107616 | 0.058535 | 1 | True | 2 |
| 11 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.107910 | 0.068011 | 0.107910 | 0.068011 | 1 | True | 1 |
fig = predictor_new_features_2a.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_2a_leaderboard.png')
# Remember to set all negative values to zero
predictions_new_features_2a = predictor_new_features_2a.predict(test)
predictions_new_features_2a.describe()
count    6493.000000
mean      157.081726
std       136.723343
min         2.417695
25%        51.513840
50%       120.013924
75%       223.343887
max       810.798950
Name: count, dtype: float64
(predictions_new_features_2a<0).sum()
0
predictions_new_features_2a = predictions_new_features_2a.clip(lower=0)
submission_new_features_2a = pd.read_csv("sampleSubmission.csv")
# Submit predictions as before
submission_new_features_2a["count"] = predictions_new_features_2a
submission_new_features_2a.to_csv("submission_new_features_2a.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features_2a.csv -m "new features 2a"
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 10
fileName date description status publicScore privateScore
------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ----------- ------------
submission_new_hpo_3e.csv 2022-12-26 13:59:46 hp tuning 3e complete 0.49307 0.49307
submission_new_hpo_3d.csv 2022-12-26 13:50:52 hp tuning 3d complete 0.52253 0.52253
submission_new_hpo_3c.csv 2022-12-26 13:46:38 hpo 3c num_bag_sets = 5 complete 0.62247 0.62247
submission_new_hpo_3b.csv 2022-12-26 13:35:38 hpo 3b num_bag_folds = 10 complete 0.63100 0.63100
submission_new_hpo_3a.csv 2022-12-26 13:24:35 hpo 3a num_stack_levels = 2 complete 0.66835 0.66835
submission_new_features_2b.csv 2022-12-26 13:13:10 new features 2b complete 0.65357 0.65357
submission_new_features_2a.csv 2022-12-26 12:35:34 new features 2a complete 0.62078 0.62078
submission.csv 2022-12-26 12:10:34 initial submission 1 complete 1.79067 1.79067
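Piping the kaggle CLI through `head` can trigger a BrokenPipeError, because `head` closes its end of the pipe after ten lines while the CLI is still writing. A general shell workaround (not kaggle-specific) is `sed -n '1,10p'`, which consumes the whole stream:

```shell
# head exits after 10 lines and closes the pipe, which the producer sees as
# EPIPE; sed -n '1,10p' prints the first 10 lines but still reads to EOF,
# so the producer never writes to a closed pipe.
seq 1 100 | sed -n '1,10p'
```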
AutoGluon's DatetimeFeatureGenerator already expanded the raw `datetime` column into
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
so several of the engineered features are duplicated. For experiment 2b, drop ['datetime_month','datetime_day','datetime_dayofweek', 'datetime_year'] to check if removing duplication improves final performance.
predictor_new_features_2b = TabularPredictor(
label = 'count',
eval_metric = 'root_mean_squared_error',
).fit(
train_data = train.drop(columns = ['datetime_month','datetime_day','datetime_dayofweek','datetime_week', 'datetime_year']),
time_limit = 600,
presets='best_quality',
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221226_124540/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221226_124540/"
AutoGluon Version: 0.6.1
Python Version: 3.7.10
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Oct 26 20:36:53 UTC 2022
Train Data Rows: 10886
Train Data Columns: 11
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 5263.32 MB
Train Data (Original) Memory Usage: 0.73 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 2 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
/usr/local/lib/python3.7/site-packages/autogluon/features/generators/datetime.py:59: FutureWarning: casting datetime64[ns, UTC] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
good_rows = series[~series.isin(bad_rows)].astype(np.int64)
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 4 | ['holiday', 'workingday', 'humidity', 'datetime_hour']
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 2 | ['humidity', 'datetime_hour']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.2s = Fit runtime
11 features in original data used to generate 15 features in processed data.
Train Data (Processed) Memory Usage: 0.93 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.3s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.7s of the 599.7s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.11s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.32s of the 599.31s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.03s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.95s of the 598.94s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.0698 = Validation score (-root_mean_squared_error)
89.33s = Training runtime
17.9s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 299.87s of the 499.86s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.5413 = Validation score (-root_mean_squared_error)
47.52s = Training runtime
3.57s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 247.65s of the 447.65s of remaining time.
-38.3046 = Validation score (-root_mean_squared_error)
12.68s = Training runtime
0.63s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 231.88s of the 431.87s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.9384 = Validation score (-root_mean_squared_error)
199.87s = Training runtime
0.21s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 27.52s of the 227.51s of remaining time.
-37.8411 = Validation score (-root_mean_squared_error)
5.98s = Training runtime
0.61s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 18.38s of the 218.38s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-72.7884 = Validation score (-root_mean_squared_error)
41.24s = Training runtime
0.47s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 172.81s of remaining time.
-31.7363 = Validation score (-root_mean_squared_error)
0.5s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 172.23s of the 172.2s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.004 = Validation score (-root_mean_squared_error)
31.22s = Training runtime
1.01s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 135.93s of the 135.91s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.5483 = Validation score (-root_mean_squared_error)
26.1s = Training runtime
0.33s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 105.54s of the 105.51s of remaining time.
-31.3624 = Validation score (-root_mean_squared_error)
30.39s = Training runtime
0.67s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 72.05s of the 72.02s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.3886 = Validation score (-root_mean_squared_error)
71.45s = Training runtime
0.12s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -3.8s of remaining time.
-30.1509 = Validation score (-root_mean_squared_error)
0.29s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 604.3s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_124540/")
predictor_new_features_2b.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.150922 25.725962 556.149490 0.000877 0.293293 3 True 14
1 CatBoost_BAG_L2 -30.388608 23.719227 468.144321 0.119392 71.445114 2 True 13
2 LightGBM_BAG_L2 -30.548320 23.927661 422.797380 0.327826 26.098173 2 True 11
3 LightGBMXT_BAG_L2 -31.003969 24.606218 427.920316 1.006384 31.221109 2 True 10
4 RandomForestMSE_BAG_L2 -31.362428 24.271482 427.091802 0.671648 30.392595 2 True 12
5 WeightedEnsemble_L2 -31.736281 22.411067 349.937558 0.000895 0.503409 2 True 9
6 LightGBMXT_BAG_L1 -33.069780 17.895373 89.332585 17.895373 89.332585 1 True 3
7 LightGBM_BAG_L1 -33.541281 3.571695 47.522052 3.571695 47.522052 1 True 4
8 CatBoost_BAG_L1 -33.938428 0.207587 199.872878 0.207587 199.872878 1 True 6
9 ExtraTreesMSE_BAG_L1 -37.841138 0.608934 5.977964 0.608934 5.977964 1 True 7
10 RandomForestMSE_BAG_L1 -38.304598 0.631685 12.677532 0.631685 12.677532 1 True 5
11 NeuralNetFastAI_BAG_L1 -72.788383 0.474992 41.243789 0.474992 41.243789 1 True 8
12 KNeighborsDist_BAG_L1 -84.125061 0.103832 0.029102 0.103832 0.029102 1 True 2
13 KNeighborsUnif_BAG_L1 -101.546199 0.105737 0.043305 0.105737 0.043305 1 True 1
Number of models trained: 14
Types of models trained:
{'StackerEnsembleModel_NNFastAiTabular', 'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 2 | ['humidity', 'datetime_hour']
('int', ['bool']) : 2 | ['holiday', 'workingday']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_124540/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -33.06977986045687,
'LightGBM_BAG_L1': -33.541281049845416,
'RandomForestMSE_BAG_L1': -38.30459792418722,
'CatBoost_BAG_L1': -33.938427705209676,
'ExtraTreesMSE_BAG_L1': -37.84113795024617,
'NeuralNetFastAI_BAG_L1': -72.78838342939757,
'WeightedEnsemble_L2': -31.73628083370923,
'LightGBMXT_BAG_L2': -31.00396851804603,
'LightGBM_BAG_L2': -30.54831969680547,
'RandomForestMSE_BAG_L2': -31.362428429852514,
'CatBoost_BAG_L2': -30.388607844819283,
'WeightedEnsemble_L3': -30.150921718093382},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20221226_124540/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_124540/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221226_124540/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221226_124540/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221226_124540/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221226_124540/models/CatBoost_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_124540/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.043305397033691406,
'KNeighborsDist_BAG_L1': 0.02910161018371582,
'LightGBMXT_BAG_L1': 89.33258485794067,
'LightGBM_BAG_L1': 47.52205228805542,
'RandomForestMSE_BAG_L1': 12.677532434463501,
'CatBoost_BAG_L1': 199.87287783622742,
'ExtraTreesMSE_BAG_L1': 5.97796368598938,
'NeuralNetFastAI_BAG_L1': 41.243788957595825,
'WeightedEnsemble_L2': 0.503408670425415,
'LightGBMXT_BAG_L2': 31.22110891342163,
'LightGBM_BAG_L2': 26.098172664642334,
'RandomForestMSE_BAG_L2': 30.392594814300537,
'CatBoost_BAG_L2': 71.44511413574219,
'WeightedEnsemble_L3': 0.2932925224304199},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10573673248291016,
'KNeighborsDist_BAG_L1': 0.10383224487304688,
'LightGBMXT_BAG_L1': 17.89537262916565,
'LightGBM_BAG_L1': 3.5716946125030518,
'RandomForestMSE_BAG_L1': 0.6316852569580078,
'CatBoost_BAG_L1': 0.20758748054504395,
'ExtraTreesMSE_BAG_L1': 0.608933687210083,
'NeuralNetFastAI_BAG_L1': 0.4749917984008789,
'WeightedEnsemble_L2': 0.0008945465087890625,
'LightGBMXT_BAG_L2': 1.0063836574554443,
'LightGBM_BAG_L2': 0.32782626152038574,
'RandomForestMSE_BAG_L2': 0.6716480255126953,
'CatBoost_BAG_L2': 0.11939239501953125,
'WeightedEnsemble_L3': 0.0008769035339355469},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.150922 25.725962 556.149490
1 CatBoost_BAG_L2 -30.388608 23.719227 468.144321
2 LightGBM_BAG_L2 -30.548320 23.927661 422.797380
3 LightGBMXT_BAG_L2 -31.003969 24.606218 427.920316
4 RandomForestMSE_BAG_L2 -31.362428 24.271482 427.091802
5 WeightedEnsemble_L2 -31.736281 22.411067 349.937558
6 LightGBMXT_BAG_L1 -33.069780 17.895373 89.332585
7 LightGBM_BAG_L1 -33.541281 3.571695 47.522052
8 CatBoost_BAG_L1 -33.938428 0.207587 199.872878
9 ExtraTreesMSE_BAG_L1 -37.841138 0.608934 5.977964
10 RandomForestMSE_BAG_L1 -38.304598 0.631685 12.677532
11 NeuralNetFastAI_BAG_L1 -72.788383 0.474992 41.243789
12 KNeighborsDist_BAG_L1 -84.125061 0.103832 0.029102
13 KNeighborsUnif_BAG_L1 -101.546199 0.105737 0.043305
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000877 0.293293 3 True
1 0.119392 71.445114 2 True
2 0.327826 26.098173 2 True
3 1.006384 31.221109 2 True
4 0.671648 30.392595 2 True
5 0.000895 0.503409 2 True
6 17.895373 89.332585 1 True
7 3.571695 47.522052 1 True
8 0.207587 199.872878 1 True
9 0.608934 5.977964 1 True
10 0.631685 12.677532 1 True
11 0.474992 41.243789 1 True
12 0.103832 0.029102 1 True
13 0.105737 0.043305 1 True
fit_order
0 14
1 13
2 11
3 10
4 12
5 9
6 3
7 4
8 6
9 7
10 5
11 8
12 2
13 1 }
predictor_new_features_2b.leaderboard(silent=True)
| | model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -30.150922 | 25.725962 | 556.149490 | 0.000877 | 0.293293 | 3 | True | 14 |
| 1 | CatBoost_BAG_L2 | -30.388608 | 23.719227 | 468.144321 | 0.119392 | 71.445114 | 2 | True | 13 |
| 2 | LightGBM_BAG_L2 | -30.548320 | 23.927661 | 422.797380 | 0.327826 | 26.098173 | 2 | True | 11 |
| 3 | LightGBMXT_BAG_L2 | -31.003969 | 24.606218 | 427.920316 | 1.006384 | 31.221109 | 2 | True | 10 |
| 4 | RandomForestMSE_BAG_L2 | -31.362428 | 24.271482 | 427.091802 | 0.671648 | 30.392595 | 2 | True | 12 |
| 5 | WeightedEnsemble_L2 | -31.736281 | 22.411067 | 349.937558 | 0.000895 | 0.503409 | 2 | True | 9 |
| 6 | LightGBMXT_BAG_L1 | -33.069780 | 17.895373 | 89.332585 | 17.895373 | 89.332585 | 1 | True | 3 |
| 7 | LightGBM_BAG_L1 | -33.541281 | 3.571695 | 47.522052 | 3.571695 | 47.522052 | 1 | True | 4 |
| 8 | CatBoost_BAG_L1 | -33.938428 | 0.207587 | 199.872878 | 0.207587 | 199.872878 | 1 | True | 6 |
| 9 | ExtraTreesMSE_BAG_L1 | -37.841138 | 0.608934 | 5.977964 | 0.608934 | 5.977964 | 1 | True | 7 |
| 10 | RandomForestMSE_BAG_L1 | -38.304598 | 0.631685 | 12.677532 | 0.631685 | 12.677532 | 1 | True | 5 |
| 11 | NeuralNetFastAI_BAG_L1 | -72.788383 | 0.474992 | 41.243789 | 0.474992 | 41.243789 | 1 | True | 8 |
| 12 | KNeighborsDist_BAG_L1 | -84.125061 | 0.103832 | 0.029102 | 0.103832 | 0.029102 | 1 | True | 2 |
| 13 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.105737 | 0.043305 | 0.105737 | 0.043305 | 1 | True | 1 |
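AutoGluon flips the metric's sign so that a higher `score_val` is better; to quote an actual validation RMSE, negate it. A minimal sketch using two rows from the leaderboard above:

```python
import pandas as pd

# Two leaderboard rows with AutoGluon's sign-flipped validation scores
# (values taken from the table above).
lb = pd.DataFrame({
    "model": ["WeightedEnsemble_L3", "CatBoost_BAG_L2"],
    "score_val": [-30.150922, -30.388608],
})

# Negate score_val to recover the actual validation RMSE (lower is better).
lb["rmse_val"] = -lb["score_val"]

# The best model is the one with the highest (least negative) score_val.
best_model = lb.loc[lb["score_val"].idxmax(), "model"]
```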
fig = predictor_new_features_2b.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_2b_leaderboard.png')
# Remember to set all negative values to zero
predictions_new_features_2b = predictor_new_features_2b.predict(test)
predictions_new_features_2b.describe()
count    6493.000000
mean      156.803406
std       135.177689
min         2.408113
25%        51.911545
50%       122.218353
75%       226.373245
max       794.151550
Name: count, dtype: float64
(predictions_new_features_2b<0).sum()
0
predictions_new_features_2b = predictions_new_features_2b.clip(lower=0)
submission_new_features_2b = pd.read_csv("sampleSubmission.csv")
# Submit predictions, same as before
submission_new_features_2b["count"] = predictions_new_features_2b
submission_new_features_2b.to_csv("submission_new_features_2b.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_features_2b.csv -m "new features 2b"
100%|█████████████████████████████████████████| 243k/243k [00:00<00:00, 434kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | head -n 10
fileName date description status publicScore privateScore
------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ----------- ------------
submission_new_features_2b.csv 2022-12-26 13:13:10 new features 2b complete 0.65357 0.65357
submission_new_features_2a.csv 2022-12-26 12:35:34 new features 2a complete 0.62078 0.62078
submission.csv 2022-12-26 12:10:34 initial submission 1 complete 1.79067 1.79067
submission_new_hpo_3f.csv 2022-12-25 20:15:51 hp tuning 3f complete 0.50219 0.50219
submission_new_hpo_3e.csv 2022-12-25 19:59:07 hp tuning 3e complete 0.53176 0.53176
submission_new_hpo_3c.csv 2022-12-25 19:35:06 hpo 3c num_bag_sets = 5 complete 0.63215 0.63215
submission_new_hpo_3b.csv 2022-12-25 19:23:27 hpo 3b num_bag_folds = 10 complete 0.79137 0.79137
submission_new_hpo_3a.csv 2022-12-25 18:45:12 hpo 3a num_stack_levels = 2 complete 0.67217 0.67217
AutoGluon also supports model-level tuning through the `hyperparameters` and `hyperparameter_tune_kwargs` arguments of `fit()`; experiments 3a-3c instead vary the ensembling/stacking configuration.
# Default ensembling/stacking hpo configuration: num_stack_levels=1, num_bag_folds=8, num_bag_sets=20
predictor_new_hpo_3a = TabularPredictor(
label = 'count',
eval_metric = 'root_mean_squared_error',
).fit(
train_data = train,
time_limit = 600,
presets='best_quality',
num_stack_levels = 2,
)
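The `hyperparameters` and `hyperparameter_tune_kwargs` arguments mentioned above take nested dictionaries. A sketch of their shape (the model keys follow AutoGluon's conventions, but the specific option values here are illustrative assumptions, not settings used in this notebook):

```python
# Per-model options: outer keys name model families, inner dicts hold
# that library's options (values below are illustrative assumptions).
hyperparameters = {
    "GBM": {"num_boost_round": 100},   # LightGBM models
    "CAT": {"iterations": 200},        # CatBoost models
}

# How the tuner searches over configurations.
hyperparameter_tune_kwargs = {
    "num_trials": 5,        # configurations to try per model
    "scheduler": "local",
    "searcher": "random",
}

# These would be passed as:
#   predictor.fit(train_data=train,
#                 hyperparameters=hyperparameters,
#                 hyperparameter_tune_kwargs=hyperparameter_tune_kwargs)
```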
No path specified. Models will be saved in: "AutogluonModels/ag-20221226_131328/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=2, num_bag_folds=8, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221226_131328/"
AutoGluon Version: 0.6.1
Python Version: 3.7.10
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Oct 26 20:36:53 UTC 2022
Train Data Rows: 10886
Train Data Columns: 16
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 5683.82 MB
Train Data (Original) Memory Usage: 1.17 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
/usr/local/lib/python3.7/site-packages/autogluon/features/generators/datetime.py:59: FutureWarning: casting datetime64[ns, UTC] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
good_rows = series[~series.isin(bad_rows)].astype(np.int64)
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 9 | ['holiday', 'workingday', 'humidity', 'datetime_hour', 'datetime_day', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.3s = Fit runtime
16 features in original data used to generate 20 features in processed data.
Train Data (Processed) Memory Usage: 1.29 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.32s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 3 stack levels (L1 to L3) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 266.45s of the 599.67s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.05s = Training runtime
0.11s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 266.05s of the 599.26s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 265.64s of the 598.85s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-32.9724 = Validation score (-root_mean_squared_error)
97.27s = Training runtime
14.68s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 159.05s of the 492.26s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.5406 = Validation score (-root_mean_squared_error)
49.5s = Training runtime
3.23s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 104.42s of the 437.63s of remaining time.
-38.2831 = Validation score (-root_mean_squared_error)
16.7s = Training runtime
0.63s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 84.07s of the 417.29s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-35.9065 = Validation score (-root_mean_squared_error)
80.94s = Training runtime
0.11s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 332.06s of remaining time.
-31.7162 = Validation score (-root_mean_squared_error)
0.43s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 220.97s of the 331.53s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.9935 = Validation score (-root_mean_squared_error)
29.97s = Training runtime
0.75s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 186.37s of the 296.92s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.3737 = Validation score (-root_mean_squared_error)
27.33s = Training runtime
0.36s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 154.6s of the 265.15s of remaining time.
-31.5167 = Validation score (-root_mean_squared_error)
30.03s = Training runtime
0.66s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 121.53s of the 232.08s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.4599 = Validation score (-root_mean_squared_error)
109.74s = Training runtime
0.13s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L2 ... Training model for up to 7.52s of the 118.08s of remaining time.
-31.203 = Validation score (-root_mean_squared_error)
10.1s = Training runtime
0.66s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the 104.72s of remaining time.
-30.0738 = Validation score (-root_mean_squared_error)
0.34s = Training runtime
0.0s = Validation runtime
Fitting 9 L3 models ...
Fitting model: LightGBMXT_BAG_L3 ... Training model for up to 104.29s of the 104.27s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.67 = Validation score (-root_mean_squared_error)
24.58s = Training runtime
0.21s = Validation runtime
Fitting model: LightGBM_BAG_L3 ... Training model for up to 75.68s of the 75.67s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.0308 = Validation score (-root_mean_squared_error)
23.89s = Training runtime
0.14s = Validation runtime
Fitting model: RandomForestMSE_BAG_L3 ... Training model for up to 47.16s of the 47.14s of remaining time.
-31.5774 = Validation score (-root_mean_squared_error)
29.64s = Training runtime
0.71s = Validation runtime
Fitting model: CatBoost_BAG_L3 ... Training model for up to 14.45s of the 14.43s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.6964 = Validation score (-root_mean_squared_error)
25.23s = Training runtime
0.11s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L4 ... Training model for up to 360.0s of the -15.06s of remaining time.
-30.598 = Validation score (-root_mean_squared_error)
0.29s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 615.58s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_131328/")
predictor_new_hpo_3a.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.073807 20.757755 421.973336 0.000817 0.342283 3 True 13
1 LightGBM_BAG_L2 -30.373742 19.214392 271.819600 0.356215 27.334411 2 True 9
2 CatBoost_BAG_L2 -30.459913 18.991939 354.226140 0.133763 109.740951 2 True 11
3 WeightedEnsemble_L4 -30.597974 22.379541 530.709490 0.000849 0.294913 4 True 18
4 CatBoost_BAG_L3 -30.696424 21.527538 476.885877 0.105708 25.225553 3 True 17
5 LightGBMXT_BAG_L2 -30.993537 19.610932 274.453940 0.752755 29.968751 2 True 8
6 LightGBM_BAG_L3 -31.030774 21.562387 475.549113 0.140558 23.888788 3 True 15
7 ExtraTreesMSE_BAG_L2 -31.203035 19.514205 254.586939 0.656029 10.101751 2 True 12
8 RandomForestMSE_BAG_L2 -31.516652 19.523067 274.514460 0.664891 30.029272 2 True 10
9 RandomForestMSE_BAG_L3 -31.577350 22.132426 481.300236 0.710597 29.639912 3 True 16
10 LightGBMXT_BAG_L3 -31.670032 21.628945 476.239813 0.207116 24.579489 3 True 14
11 WeightedEnsemble_L2 -31.716206 18.751797 244.867637 0.000946 0.428463 2 True 7
12 LightGBMXT_BAG_L1 -32.972358 14.677035 97.267682 14.677035 97.267682 1 True 3
13 LightGBM_BAG_L1 -33.540630 3.233115 49.496090 3.233115 49.496090 1 True 4
14 CatBoost_BAG_L1 -35.906492 0.108761 80.936460 0.108761 80.936460 1 True 6
15 RandomForestMSE_BAG_L1 -38.283140 0.627138 16.697175 0.627138 16.697175 1 True 5
16 KNeighborsDist_BAG_L1 -84.125061 0.104802 0.041766 0.104802 0.041766 1 True 2
17 KNeighborsUnif_BAG_L1 -101.546199 0.107326 0.046016 0.107326 0.046016 1 True 1
Number of models trained: 18
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 4 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_131328/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L2': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L3': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L3': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L3': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L3': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L4': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -32.972357766618615,
'LightGBM_BAG_L1': -33.54062969122618,
'RandomForestMSE_BAG_L1': -38.28313968009453,
'CatBoost_BAG_L1': -35.90649192025656,
'WeightedEnsemble_L2': -31.71620633302866,
'LightGBMXT_BAG_L2': -30.99353658659002,
'LightGBM_BAG_L2': -30.37374165494332,
'RandomForestMSE_BAG_L2': -31.51665179898909,
'CatBoost_BAG_L2': -30.45991336506166,
'ExtraTreesMSE_BAG_L2': -31.203034705663896,
'WeightedEnsemble_L3': -30.073807135660008,
'LightGBMXT_BAG_L3': -31.670031964593548,
'LightGBM_BAG_L3': -31.030774075112422,
'RandomForestMSE_BAG_L3': -31.57735006240393,
'CatBoost_BAG_L3': -30.696424216626387,
'WeightedEnsemble_L4': -30.597973671430214},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221226_131328/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221226_131328/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221226_131328/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221226_131328/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221226_131328/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221226_131328/models/CatBoost_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_131328/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221226_131328/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221226_131328/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221226_131328/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221226_131328/models/CatBoost_BAG_L2/',
'ExtraTreesMSE_BAG_L2': 'AutogluonModels/ag-20221226_131328/models/ExtraTreesMSE_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_131328/models/WeightedEnsemble_L3/',
'LightGBMXT_BAG_L3': 'AutogluonModels/ag-20221226_131328/models/LightGBMXT_BAG_L3/',
'LightGBM_BAG_L3': 'AutogluonModels/ag-20221226_131328/models/LightGBM_BAG_L3/',
'RandomForestMSE_BAG_L3': 'AutogluonModels/ag-20221226_131328/models/RandomForestMSE_BAG_L3/',
'CatBoost_BAG_L3': 'AutogluonModels/ag-20221226_131328/models/CatBoost_BAG_L3/',
'WeightedEnsemble_L4': 'AutogluonModels/ag-20221226_131328/models/WeightedEnsemble_L4/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.04601550102233887,
'KNeighborsDist_BAG_L1': 0.04176592826843262,
'LightGBMXT_BAG_L1': 97.26768207550049,
'LightGBM_BAG_L1': 49.496089696884155,
'RandomForestMSE_BAG_L1': 16.69717526435852,
'CatBoost_BAG_L1': 80.93646025657654,
'WeightedEnsemble_L2': 0.4284634590148926,
'LightGBMXT_BAG_L2': 29.968751430511475,
'LightGBM_BAG_L2': 27.334410905838013,
'RandomForestMSE_BAG_L2': 30.029271602630615,
'CatBoost_BAG_L2': 109.74095106124878,
'ExtraTreesMSE_BAG_L2': 10.101750612258911,
'WeightedEnsemble_L3': 0.342282772064209,
'LightGBMXT_BAG_L3': 24.57948899269104,
'LightGBM_BAG_L3': 23.8887882232666,
'RandomForestMSE_BAG_L3': 29.639911890029907,
'CatBoost_BAG_L3': 25.225552558898926,
'WeightedEnsemble_L4': 0.29491329193115234},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10732555389404297,
'KNeighborsDist_BAG_L1': 0.10480237007141113,
'LightGBMXT_BAG_L1': 14.677034616470337,
'LightGBM_BAG_L1': 3.233114719390869,
'RandomForestMSE_BAG_L1': 0.6271383762359619,
'CatBoost_BAG_L1': 0.10876083374023438,
'WeightedEnsemble_L2': 0.000946044921875,
'LightGBMXT_BAG_L2': 0.7527554035186768,
'LightGBM_BAG_L2': 0.356215238571167,
'RandomForestMSE_BAG_L2': 0.6648910045623779,
'CatBoost_BAG_L2': 0.13376283645629883,
'ExtraTreesMSE_BAG_L2': 0.6560285091400146,
'WeightedEnsemble_L3': 0.0008165836334228516,
'LightGBMXT_BAG_L3': 0.20711588859558105,
'LightGBM_BAG_L3': 0.14055800437927246,
'RandomForestMSE_BAG_L3': 0.7105965614318848,
'CatBoost_BAG_L3': 0.10570812225341797,
'WeightedEnsemble_L4': 0.0008492469787597656},
'num_bag_folds': 8,
'max_stack_level': 4,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L4': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.073807 20.757755 421.973336
1 LightGBM_BAG_L2 -30.373742 19.214392 271.819600
2 CatBoost_BAG_L2 -30.459913 18.991939 354.226140
3 WeightedEnsemble_L4 -30.597974 22.379541 530.709490
4 CatBoost_BAG_L3 -30.696424 21.527538 476.885877
5 LightGBMXT_BAG_L2 -30.993537 19.610932 274.453940
6 LightGBM_BAG_L3 -31.030774 21.562387 475.549113
7 ExtraTreesMSE_BAG_L2 -31.203035 19.514205 254.586939
8 RandomForestMSE_BAG_L2 -31.516652 19.523067 274.514460
9 RandomForestMSE_BAG_L3 -31.577350 22.132426 481.300236
10 LightGBMXT_BAG_L3 -31.670032 21.628945 476.239813
11 WeightedEnsemble_L2 -31.716206 18.751797 244.867637
12 LightGBMXT_BAG_L1 -32.972358 14.677035 97.267682
13 LightGBM_BAG_L1 -33.540630 3.233115 49.496090
14 CatBoost_BAG_L1 -35.906492 0.108761 80.936460
15 RandomForestMSE_BAG_L1 -38.283140 0.627138 16.697175
16 KNeighborsDist_BAG_L1 -84.125061 0.104802 0.041766
17 KNeighborsUnif_BAG_L1 -101.546199 0.107326 0.046016
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000817 0.342283 3 True
1 0.356215 27.334411 2 True
2 0.133763 109.740951 2 True
3 0.000849 0.294913 4 True
4 0.105708 25.225553 3 True
5 0.752755 29.968751 2 True
6 0.140558 23.888788 3 True
7 0.656029 10.101751 2 True
8 0.664891 30.029272 2 True
9 0.710597 29.639912 3 True
10 0.207116 24.579489 3 True
11 0.000946 0.428463 2 True
12 14.677035 97.267682 1 True
13 3.233115 49.496090 1 True
14 0.108761 80.936460 1 True
15 0.627138 16.697175 1 True
16 0.104802 0.041766 1 True
17 0.107326 0.046016 1 True
fit_order
0 13
1 9
2 11
3 18
4 17
5 8
6 15
7 12
8 10
9 16
10 14
11 7
12 3
13 4
14 6
15 5
16 2
17 1 }
predictor_new_hpo_3a.leaderboard(silent=True)
| | model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -30.073807 | 20.757755 | 421.973336 | 0.000817 | 0.342283 | 3 | True | 13 |
| 1 | LightGBM_BAG_L2 | -30.373742 | 19.214392 | 271.819600 | 0.356215 | 27.334411 | 2 | True | 9 |
| 2 | CatBoost_BAG_L2 | -30.459913 | 18.991939 | 354.226140 | 0.133763 | 109.740951 | 2 | True | 11 |
| 3 | WeightedEnsemble_L4 | -30.597974 | 22.379541 | 530.709490 | 0.000849 | 0.294913 | 4 | True | 18 |
| 4 | CatBoost_BAG_L3 | -30.696424 | 21.527538 | 476.885877 | 0.105708 | 25.225553 | 3 | True | 17 |
| 5 | LightGBMXT_BAG_L2 | -30.993537 | 19.610932 | 274.453940 | 0.752755 | 29.968751 | 2 | True | 8 |
| 6 | LightGBM_BAG_L3 | -31.030774 | 21.562387 | 475.549113 | 0.140558 | 23.888788 | 3 | True | 15 |
| 7 | ExtraTreesMSE_BAG_L2 | -31.203035 | 19.514205 | 254.586939 | 0.656029 | 10.101751 | 2 | True | 12 |
| 8 | RandomForestMSE_BAG_L2 | -31.516652 | 19.523067 | 274.514460 | 0.664891 | 30.029272 | 2 | True | 10 |
| 9 | RandomForestMSE_BAG_L3 | -31.577350 | 22.132426 | 481.300236 | 0.710597 | 29.639912 | 3 | True | 16 |
| 10 | LightGBMXT_BAG_L3 | -31.670032 | 21.628945 | 476.239813 | 0.207116 | 24.579489 | 3 | True | 14 |
| 11 | WeightedEnsemble_L2 | -31.716206 | 18.751797 | 244.867637 | 0.000946 | 0.428463 | 2 | True | 7 |
| 12 | LightGBMXT_BAG_L1 | -32.972358 | 14.677035 | 97.267682 | 14.677035 | 97.267682 | 1 | True | 3 |
| 13 | LightGBM_BAG_L1 | -33.540630 | 3.233115 | 49.496090 | 3.233115 | 49.496090 | 1 | True | 4 |
| 14 | CatBoost_BAG_L1 | -35.906492 | 0.108761 | 80.936460 | 0.108761 | 80.936460 | 1 | True | 6 |
| 15 | RandomForestMSE_BAG_L1 | -38.283140 | 0.627138 | 16.697175 | 0.627138 | 16.697175 | 1 | True | 5 |
| 16 | KNeighborsDist_BAG_L1 | -84.125061 | 0.104802 | 0.041766 | 0.104802 | 0.041766 | 1 | True | 2 |
| 17 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.107326 | 0.046016 | 0.107326 | 0.046016 | 1 | True | 1 |
fig = predictor_new_hpo_3a.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_3a_leaderboard.png')
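The default bar chart plots `score_val`, which AutoGluon reports as negative RMSE, so all bars point downward. A minimal sketch of a more readable variant (assuming `matplotlib` is installed; the small `lb` frame here is hypothetical stand-in data for `predictor_new_hpo_3a.leaderboard(silent=True)`):

```python
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # headless backend so savefig works without a display
import matplotlib.pyplot as plt

# Hypothetical subset of the leaderboard; in the notebook this frame
# comes from predictor_new_hpo_3a.leaderboard(silent=True)
lb = pd.DataFrame({
    "model": ["WeightedEnsemble_L3", "LightGBM_BAG_L2", "KNeighborsUnif_BAG_L1"],
    "score_val": [-30.073807, -30.373742, -101.546199],
})

# Flip the sign so bars show RMSE directly (lower = better)
lb["rmse_val"] = -lb["score_val"]
ax = lb.sort_values("rmse_val").plot(kind="barh", x="model", y="rmse_val", legend=False)
ax.set_xlabel("validation RMSE")
ax.figure.savefig("exp_3a_leaderboard_rmse.png", bbox_inches="tight")
```

Sorting before plotting also makes the best model easy to spot at a glance.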
# Remember to set all negative values to zero
predictions_new_hpo_3a = predictor_new_hpo_3a.predict(test)
predictions_new_hpo_3a.describe()
count    6493.000000
mean      156.263962
std       136.127396
min         2.221345
25%        51.584747
50%       119.469810
75%       222.454544
max       803.260193
Name: count, dtype: float64
(predictions_new_hpo_3a<0).sum()
0
predictions_new_hpo_3a = predictions_new_hpo_3a.apply(lambda x: 0 if x<0 else x)
predictions_new_hpo_3a.describe()
count    6493.000000
mean      156.263971
std       136.127393
min         2.221345
25%        51.584747
50%       119.469810
75%       222.454544
max       803.260193
Name: count, dtype: float64
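The `apply` with a lambda works, but pandas' `clip` does the same zero-flooring in a single vectorized call. A minimal sketch on a toy series (hypothetical values; the notebook's actual predictions happened to contain no negatives):

```python
import pandas as pd

# Toy predictions including a couple of negative values
preds = pd.Series([-3.2, 0.0, 12.5, -0.1, 98.4], name="count")

# clip(lower=0) floors negatives at zero and leaves other values unchanged
preds_nonneg = preds.clip(lower=0)
```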
submission_new_hpo_3a = pd.read_csv("sampleSubmission.csv")
# Same as before: fill in the sample submission and submit the predictions
submission_new_hpo_3a["count"] = predictions_new_hpo_3a
submission_new_hpo_3a.to_csv("submission_new_hpo_3a.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo_3a.csv -m "hpo 3a num_stack_levels = 2"
100%|█████████████████████████████████████████| 243k/243k [00:00<00:00, 470kB/s]
Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                        date                 description                  status    publicScore  privateScore
submission_new_hpo_3a.csv       2022-12-26 13:24:35  hpo 3a num_stack_levels = 2  complete  0.66835      0.66835
submission_new_features_2b.csv  2022-12-26 13:13:10  new features 2b              complete  0.65357      0.65357
submission_new_features_2a.csv  2022-12-26 12:35:34  new features 2a              complete  0.62078      0.62078
submission.csv                  2022-12-26 12:10:34  initial submission 1         complete  1.79067      1.79067
tail: error writing 'standard output': Broken pipe
predictor_new_hpo_3b = TabularPredictor(
label = 'count',
eval_metric = 'root_mean_squared_error',
).fit(
train_data = train,
presets='best_quality',
num_bag_folds = 10,
time_limit = 600,
)
No path specified. Models will be saved in: "AutogluonModels/ag-20221226_132437/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=10, num_bag_sets=20
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221226_132437/"
AutoGluon Version: 0.6.1
Python Version: 3.7.10
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Oct 26 20:36:53 UTC 2022
Train Data Rows: 10886
Train Data Columns: 16
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 5743.87 MB
Train Data (Original) Memory Usage: 1.17 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
/usr/local/lib/python3.7/site-packages/autogluon/features/generators/datetime.py:59: FutureWarning: casting datetime64[ns, UTC] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
good_rows = series[~series.isin(bad_rows)].astype(np.int64)
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 9 | ['holiday', 'workingday', 'humidity', 'datetime_hour', 'datetime_day', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.2s = Fit runtime
16 features in original data used to generate 20 features in processed data.
Train Data (Processed) Memory Usage: 1.29 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.26s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.72s of the 599.73s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.05s = Training runtime
0.1s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.32s of the 599.33s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.94s of the 598.94s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
-32.5991 = Validation score (-root_mean_squared_error)
108.22s = Training runtime
12.56s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 283.08s of the 483.08s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
-32.9398 = Validation score (-root_mean_squared_error)
57.67s = Training runtime
2.75s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 220.02s of the 420.02s of remaining time.
-38.2831 = Validation score (-root_mean_squared_error)
16.4s = Training runtime
0.64s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 200.4s of the 400.41s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
-34.3091 = Validation score (-root_mean_squared_error)
179.48s = Training runtime
0.21s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 16.52s of the 216.53s of remaining time.
-37.4237 = Validation score (-root_mean_squared_error)
7.58s = Training runtime
0.63s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 5.69s of the 205.7s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
Time limit exceeded... Skipping NeuralNetFastAI_BAG_L1.
2022-12-26 13:31:20,732 ERROR worker.py:400 -- Unhandled error (suppress with 'RAY_IGNORE_UNHANDLED_ERRORS=1'): The worker died unexpectedly while executing this task. Check python-core-worker-*.log files for more information.
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 196.61s of remaining time.
-31.3821 = Validation score (-root_mean_squared_error)
0.79s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 195.72s of the 195.69s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
-30.7377 = Validation score (-root_mean_squared_error)
40.78s = Training runtime
0.85s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 150.6s of the 150.58s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
-30.0949 = Validation score (-root_mean_squared_error)
35.52s = Training runtime
0.37s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 110.6s of the 110.57s of remaining time.
-31.5293 = Validation score (-root_mean_squared_error)
32.15s = Training runtime
0.68s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 75.37s of the 75.35s of remaining time.
Fitting 10 child models (S1F1 - S1F10) | Fitting with ParallelLocalFoldFittingStrategy
-30.3329 = Validation score (-root_mean_squared_error)
78.63s = Training runtime
0.17s = Validation runtime
Completed 1/20 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -7.41s of remaining time.
-29.8532 = Validation score (-root_mean_squared_error)
0.3s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 607.92s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_132437/")
predictor_new_hpo_3b.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -29.853200 18.382146 524.679103 0.001139 0.298670 3 True 13
1 LightGBM_BAG_L2 -30.094854 17.364107 404.967932 0.367318 35.518546 2 True 10
2 CatBoost_BAG_L2 -30.332858 17.167829 448.077864 0.171040 78.628478 2 True 12
3 LightGBMXT_BAG_L2 -30.737652 17.842649 410.233409 0.845860 40.784023 2 True 9
4 WeightedEnsemble_L2 -31.382073 16.264564 362.617095 0.001435 0.794605 2 True 8
5 RandomForestMSE_BAG_L2 -31.529282 17.674947 401.600474 0.678158 32.151088 2 True 11
6 LightGBMXT_BAG_L1 -32.599055 12.561143 108.223031 12.561143 108.223031 1 True 3
7 LightGBM_BAG_L1 -32.939787 2.748748 57.674279 2.748748 57.674279 1 True 4
8 CatBoost_BAG_L1 -34.309095 0.211043 179.481567 0.211043 179.481567 1 True 6
9 ExtraTreesMSE_BAG_L1 -37.423673 0.630543 7.581843 0.630543 7.581843 1 True 7
10 RandomForestMSE_BAG_L1 -38.283140 0.638364 16.402948 0.638364 16.402948 1 True 5
11 KNeighborsDist_BAG_L1 -84.125061 0.103830 0.040665 0.103830 0.040665 1 True 2
12 KNeighborsUnif_BAG_L1 -101.546199 0.103118 0.045053 0.103118 0.045053 1 True 1
Number of models trained: 13
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 10 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_132437/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -32.599055251891826,
'LightGBM_BAG_L1': -32.9397869637605,
'RandomForestMSE_BAG_L1': -38.28313968009453,
'CatBoost_BAG_L1': -34.30909469417376,
'ExtraTreesMSE_BAG_L1': -37.42367272491501,
'WeightedEnsemble_L2': -31.382072981034113,
'LightGBMXT_BAG_L2': -30.73765244011092,
'LightGBM_BAG_L2': -30.094853502697227,
'RandomForestMSE_BAG_L2': -31.52928167241682,
'CatBoost_BAG_L2': -30.332857556154007,
'WeightedEnsemble_L3': -29.85320041922994},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221226_132437/models/ExtraTreesMSE_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_132437/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221226_132437/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221226_132437/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221226_132437/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221226_132437/models/CatBoost_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_132437/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.04505300521850586,
'KNeighborsDist_BAG_L1': 0.0406646728515625,
'LightGBMXT_BAG_L1': 108.22303080558777,
'LightGBM_BAG_L1': 57.67427921295166,
'RandomForestMSE_BAG_L1': 16.4029483795166,
'CatBoost_BAG_L1': 179.48156690597534,
'ExtraTreesMSE_BAG_L1': 7.58184289932251,
'WeightedEnsemble_L2': 0.7946052551269531,
'LightGBMXT_BAG_L2': 40.78402328491211,
'LightGBM_BAG_L2': 35.518545627593994,
'RandomForestMSE_BAG_L2': 32.15108823776245,
'CatBoost_BAG_L2': 78.62847805023193,
'WeightedEnsemble_L3': 0.29867029190063477},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10311770439147949,
'KNeighborsDist_BAG_L1': 0.10383033752441406,
'LightGBMXT_BAG_L1': 12.561143398284912,
'LightGBM_BAG_L1': 2.7487478256225586,
'RandomForestMSE_BAG_L1': 0.6383638381958008,
'CatBoost_BAG_L1': 0.2110428810119629,
'ExtraTreesMSE_BAG_L1': 0.6305429935455322,
'WeightedEnsemble_L2': 0.0014352798461914062,
'LightGBMXT_BAG_L2': 0.8458602428436279,
'LightGBM_BAG_L2': 0.36731791496276855,
'RandomForestMSE_BAG_L2': 0.6781578063964844,
'CatBoost_BAG_L2': 0.17103958129882812,
'WeightedEnsemble_L3': 0.0011394023895263672},
'num_bag_folds': 10,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -29.853200 18.382146 524.679103
1 LightGBM_BAG_L2 -30.094854 17.364107 404.967932
2 CatBoost_BAG_L2 -30.332858 17.167829 448.077864
3 LightGBMXT_BAG_L2 -30.737652 17.842649 410.233409
4 WeightedEnsemble_L2 -31.382073 16.264564 362.617095
5 RandomForestMSE_BAG_L2 -31.529282 17.674947 401.600474
6 LightGBMXT_BAG_L1 -32.599055 12.561143 108.223031
7 LightGBM_BAG_L1 -32.939787 2.748748 57.674279
8 CatBoost_BAG_L1 -34.309095 0.211043 179.481567
9 ExtraTreesMSE_BAG_L1 -37.423673 0.630543 7.581843
10 RandomForestMSE_BAG_L1 -38.283140 0.638364 16.402948
11 KNeighborsDist_BAG_L1 -84.125061 0.103830 0.040665
12 KNeighborsUnif_BAG_L1 -101.546199 0.103118 0.045053
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.001139 0.298670 3 True
1 0.367318 35.518546 2 True
2 0.171040 78.628478 2 True
3 0.845860 40.784023 2 True
4 0.001435 0.794605 2 True
5 0.678158 32.151088 2 True
6 12.561143 108.223031 1 True
7 2.748748 57.674279 1 True
8 0.211043 179.481567 1 True
9 0.630543 7.581843 1 True
10 0.638364 16.402948 1 True
11 0.103830 0.040665 1 True
12 0.103118 0.045053 1 True
fit_order
0 13
1 10
2 12
3 9
4 8
5 11
6 3
7 4
8 6
9 7
10 5
11 2
12 1 }
predictor_new_hpo_3b.leaderboard(silent=True)
| | model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -29.853200 | 18.382146 | 524.679103 | 0.001139 | 0.298670 | 3 | True | 13 |
| 1 | LightGBM_BAG_L2 | -30.094854 | 17.364107 | 404.967932 | 0.367318 | 35.518546 | 2 | True | 10 |
| 2 | CatBoost_BAG_L2 | -30.332858 | 17.167829 | 448.077864 | 0.171040 | 78.628478 | 2 | True | 12 |
| 3 | LightGBMXT_BAG_L2 | -30.737652 | 17.842649 | 410.233409 | 0.845860 | 40.784023 | 2 | True | 9 |
| 4 | WeightedEnsemble_L2 | -31.382073 | 16.264564 | 362.617095 | 0.001435 | 0.794605 | 2 | True | 8 |
| 5 | RandomForestMSE_BAG_L2 | -31.529282 | 17.674947 | 401.600474 | 0.678158 | 32.151088 | 2 | True | 11 |
| 6 | LightGBMXT_BAG_L1 | -32.599055 | 12.561143 | 108.223031 | 12.561143 | 108.223031 | 1 | True | 3 |
| 7 | LightGBM_BAG_L1 | -32.939787 | 2.748748 | 57.674279 | 2.748748 | 57.674279 | 1 | True | 4 |
| 8 | CatBoost_BAG_L1 | -34.309095 | 0.211043 | 179.481567 | 0.211043 | 179.481567 | 1 | True | 6 |
| 9 | ExtraTreesMSE_BAG_L1 | -37.423673 | 0.630543 | 7.581843 | 0.630543 | 7.581843 | 1 | True | 7 |
| 10 | RandomForestMSE_BAG_L1 | -38.283140 | 0.638364 | 16.402948 | 0.638364 | 16.402948 | 1 | True | 5 |
| 11 | KNeighborsDist_BAG_L1 | -84.125061 | 0.103830 | 0.040665 | 0.103830 | 0.040665 | 1 | True | 2 |
| 12 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.103118 | 0.045053 | 0.103118 | 0.045053 | 1 | True | 1 |
fig = predictor_new_hpo_3b.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_3b_leaderboard.png')
# Remember to set all negative values to zero
predictions_new_hpo_3b = predictor_new_hpo_3b.predict(test)
predictions_new_hpo_3b.describe()
count    6493.000000
mean      160.360275
std       139.554337
min         2.212470
25%        52.166279
50%       124.043221
75%       229.185211
max       804.144897
Name: count, dtype: float64
(predictions_new_hpo_3b<0).sum()
0
predictions_new_hpo_3b = predictions_new_hpo_3b.clip(lower=0)  # zero out any negative predictions
predictions_new_hpo_3b.describe()
count    6493.000000
mean      160.360273
std       139.554337
min         2.212470
25%        52.166279
50%       124.043221
75%       229.185211
max       804.144897
Name: count, dtype: float64
submission_new_hpo_3b = pd.read_csv("sampleSubmission.csv")
# Submit the predictions, same as before
submission_new_hpo_3b["count"] = predictions_new_hpo_3b
submission_new_hpo_3b.to_csv("submission_new_hpo_3b.csv", index=False)
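Before calling the Kaggle CLI, it can be worth sanity-checking the file on disk. A minimal sketch using only the stdlib `csv` module; the helper name `check_submission` and the in-memory sample rows are illustrative, not part of the project code (in the notebook you would pass `open("submission_new_hpo_3b.csv")`):

```python
import csv
import io

def check_submission(f, count_col="count"):
    """Sanity-check a Kaggle submission file object: the header must
    contain the target column and every prediction must be >= 0."""
    reader = csv.DictReader(f)
    assert count_col in reader.fieldnames, "missing target column"
    rows = 0
    for row in reader:
        assert float(row[count_col]) >= 0, "negative prediction found"
        rows += 1
    return rows

# Illustrative in-memory example with the same datetime,count layout:
sample = io.StringIO(
    "datetime,count\n"
    "2011-01-20 00:00:00,12.5\n"
    "2011-01-20 01:00:00,0\n"
)
print(check_submission(sample))  # → 2
```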
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo_3b.csv -m "hpo 3b num_bag_folds = 10"
100%|█████████████████████████████████████████| 243k/243k [00:00<00:00, 477kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName date description status publicScore privateScore
------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ----------- ------------
submission_new_hpo_3b.csv 2022-12-26 13:35:38 hpo 3b num_bag_folds = 10 complete 0.63100 0.63100
submission_new_hpo_3a.csv 2022-12-26 13:24:35 hpo 3a num_stack_levels = 2 complete 0.66835 0.66835
submission_new_features_2b.csv 2022-12-26 13:13:10 new features 2b complete 0.65357 0.65357
submission_new_features_2a.csv 2022-12-26 12:35:34 new features 2a complete 0.62078 0.62078
predictor_new_hpo_3c = TabularPredictor(
label = 'count',
eval_metric = 'root_mean_squared_error',
).fit(
train_data = train,
presets='best_quality',
num_bag_sets = 5,
time_limit = 600,
)
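Since the `best_quality` preset picks `num_bag_folds=8` via `auto_stack` (see the stack configuration line in the log below), asking for `num_bag_sets=5` allows up to 8 × 5 = 40 child models per bagged model; the log shows only 1/5 repeats completed before the time limit. A quick check of that arithmetic:

```python
num_bag_folds = 8   # chosen by the best_quality preset (auto_stack)
num_bag_sets = 5    # requested repeats of the k-fold bagging
max_child_models = num_bag_folds * num_bag_sets
print(max_child_models)  # → 40
```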
No path specified. Models will be saved in: "AutogluonModels/ag-20221226_133540/"
Presets specified: ['best_quality']
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=5
Beginning AutoGluon training ... Time limit = 600s
AutoGluon will save models to "AutogluonModels/ag-20221226_133540/"
AutoGluon Version: 0.6.1
Python Version: 3.7.10
Operating System: Linux
Platform Machine: x86_64
Platform Version: #1 SMP Wed Oct 26 20:36:53 UTC 2022
Train Data Rows: 10886
Train Data Columns: 16
Label Column: count
Preprocessing data ...
AutoGluon infers your prediction problem is: 'regression' (because dtype of label-column == int and many unique label-values observed).
Label info (max, min, mean, stddev): (977, 1, 191.57413, 181.14445)
If 'regression' is not the correct problem_type, please manually specify the problem_type parameter during predictor init (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
Available Memory: 5786.94 MB
Train Data (Original) Memory Usage: 1.17 MB (0.0% of available memory)
Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
Stage 1 Generators:
Fitting AsTypeFeatureGenerator...
Note: Converting 3 features to boolean dtype as they only contain 2 unique values.
Stage 2 Generators:
Fitting FillNaFeatureGenerator...
Stage 3 Generators:
Fitting IdentityFeatureGenerator...
Fitting CategoryFeatureGenerator...
Fitting CategoryMemoryMinimizeFeatureGenerator...
Fitting DatetimeFeatureGenerator...
/usr/local/lib/python3.7/site-packages/autogluon/features/generators/datetime.py:59: FutureWarning: casting datetime64[ns, UTC] values to int64 with .astype(...) is deprecated and will raise in a future version. Use .view(...) instead.
good_rows = series[~series.isin(bad_rows)].astype(np.int64)
Stage 4 Generators:
Fitting DropUniqueFeatureGenerator...
Types of features in original data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('datetime', []) : 1 | ['datetime']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 9 | ['holiday', 'workingday', 'humidity', 'datetime_hour', 'datetime_day', ...]
Types of features in processed data (raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
0.2s = Fit runtime
16 features in original data used to generate 20 features in processed data.
Train Data (Processed) Memory Usage: 1.29 MB (0.0% of available memory)
Data preprocessing and feature engineering runtime = 0.26s ...
AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'
This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.
To change this, specify the eval_metric parameter of Predictor()
AutoGluon will fit 2 stack levels (L1 to L2) ...
Fitting 11 L1 models ...
Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 399.73s of the 599.73s of remaining time.
-101.5462 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.11s = Validation runtime
Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 399.34s of the 599.35s of remaining time.
-84.1251 = Validation score (-root_mean_squared_error)
0.04s = Training runtime
0.1s = Validation runtime
Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 398.96s of the 598.96s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-32.9724 = Validation score (-root_mean_squared_error)
97.28s = Training runtime
13.73s = Validation runtime
Fitting model: LightGBM_BAG_L1 ... Training model for up to 292.97s of the 492.97s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.5406 = Validation score (-root_mean_squared_error)
48.93s = Training runtime
3.24s = Validation runtime
Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 238.94s of the 438.95s of remaining time.
-38.2831 = Validation score (-root_mean_squared_error)
16.3s = Training runtime
0.65s = Validation runtime
Fitting model: CatBoost_BAG_L1 ... Training model for up to 219.42s of the 419.43s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-33.9002 = Validation score (-root_mean_squared_error)
189.89s = Training runtime
0.19s = Validation runtime
Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 25.38s of the 225.39s of remaining time.
-37.4237 = Validation score (-root_mean_squared_error)
7.49s = Training runtime
0.62s = Validation runtime
Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 14.74s of the 214.75s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-82.126 = Validation score (-root_mean_squared_error)
39.14s = Training runtime
0.85s = Validation runtime
Completed 1/5 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 171.26s of remaining time.
-31.6286 = Validation score (-root_mean_squared_error)
0.77s = Training runtime
0.0s = Validation runtime
Fitting 9 L2 models ...
Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 170.39s of the 170.37s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-31.0788 = Validation score (-root_mean_squared_error)
29.83s = Training runtime
0.6s = Validation runtime
Fitting model: LightGBM_BAG_L2 ... Training model for up to 135.72s of the 135.7s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.4329 = Validation score (-root_mean_squared_error)
27.99s = Training runtime
0.29s = Validation runtime
Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 102.82s of the 102.8s of remaining time.
-31.3065 = Validation score (-root_mean_squared_error)
34.96s = Training runtime
0.69s = Validation runtime
Fitting model: CatBoost_BAG_L2 ... Training model for up to 64.78s of the 64.76s of remaining time.
Fitting 8 child models (S1F1 - S1F8) | Fitting with ParallelLocalFoldFittingStrategy
-30.6277 = Validation score (-root_mean_squared_error)
65.77s = Training runtime
0.13s = Validation runtime
Completed 1/5 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the -5.39s of remaining time.
-30.1692 = Validation score (-root_mean_squared_error)
0.29s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 605.89s ... Best model: "WeightedEnsemble_L3"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_133540/")
predictor_new_hpo_3c.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L3 -30.169168 21.202884 557.956483 0.000837 0.289357 3 True 14
1 LightGBM_BAG_L2 -30.432906 19.787925 427.108798 0.289943 27.991384 2 True 11
2 CatBoost_BAG_L2 -30.627700 19.627053 464.884130 0.129071 65.766716 2 True 13
3 LightGBMXT_BAG_L2 -31.078828 20.097915 428.952320 0.599934 29.834905 2 True 10
4 RandomForestMSE_BAG_L2 -31.306516 20.183100 434.074120 0.685118 34.956706 2 True 12
5 WeightedEnsemble_L2 -31.628556 17.923118 353.212819 0.001369 0.766741 2 True 9
6 LightGBMXT_BAG_L1 -32.972358 13.731958 97.282146 13.731958 97.282146 1 True 3
7 LightGBM_BAG_L1 -33.540630 3.239727 48.931604 3.239727 48.931604 1 True 4
8 CatBoost_BAG_L1 -33.900201 0.194296 189.893482 0.194296 189.893482 1 True 6
9 ExtraTreesMSE_BAG_L1 -37.423673 0.616393 7.486015 0.616393 7.486015 1 True 7
10 RandomForestMSE_BAG_L1 -38.283140 0.651644 16.297994 0.651644 16.297994 1 True 5
11 NeuralNetFastAI_BAG_L1 -82.125954 0.853284 39.143703 0.853284 39.143703 1 True 8
12 KNeighborsDist_BAG_L1 -84.125061 0.104123 0.040852 0.104123 0.040852 1 True 2
13 KNeighborsUnif_BAG_L1 -101.546199 0.106556 0.041618 0.106556 0.041618 1 True 1
Number of models trained: 14
Types of models trained:
{'StackerEnsembleModel_NNFastAiTabular', 'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_KNN', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_133540/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'KNeighborsUnif_BAG_L1': 'StackerEnsembleModel_KNN',
'KNeighborsDist_BAG_L1': 'StackerEnsembleModel_KNN',
'LightGBMXT_BAG_L1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1': 'StackerEnsembleModel_CatBoost',
'ExtraTreesMSE_BAG_L1': 'StackerEnsembleModel_XT',
'NeuralNetFastAI_BAG_L1': 'StackerEnsembleModel_NNFastAiTabular',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBMXT_BAG_L2': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L2': 'StackerEnsembleModel_LGB',
'RandomForestMSE_BAG_L2': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2': 'StackerEnsembleModel_CatBoost',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'KNeighborsUnif_BAG_L1': -101.54619908446061,
'KNeighborsDist_BAG_L1': -84.12506123181602,
'LightGBMXT_BAG_L1': -32.972357766618615,
'LightGBM_BAG_L1': -33.54062969122618,
'RandomForestMSE_BAG_L1': -38.28313968009453,
'CatBoost_BAG_L1': -33.90020115203571,
'ExtraTreesMSE_BAG_L1': -37.42367272491501,
'NeuralNetFastAI_BAG_L1': -82.12595438203813,
'WeightedEnsemble_L2': -31.628556349538457,
'LightGBMXT_BAG_L2': -31.078827596661277,
'LightGBM_BAG_L2': -30.43290636169063,
'RandomForestMSE_BAG_L2': -31.306515569548793,
'CatBoost_BAG_L2': -30.627699888049435,
'WeightedEnsemble_L3': -30.169168481222528},
'model_best': 'WeightedEnsemble_L3',
'model_paths': {'KNeighborsUnif_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/KNeighborsUnif_BAG_L1/',
'KNeighborsDist_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/KNeighborsDist_BAG_L1/',
'LightGBMXT_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/LightGBMXT_BAG_L1/',
'LightGBM_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/LightGBM_BAG_L1/',
'RandomForestMSE_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/RandomForestMSE_BAG_L1/',
'CatBoost_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/CatBoost_BAG_L1/',
'ExtraTreesMSE_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/ExtraTreesMSE_BAG_L1/',
'NeuralNetFastAI_BAG_L1': 'AutogluonModels/ag-20221226_133540/models/NeuralNetFastAI_BAG_L1/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_133540/models/WeightedEnsemble_L2/',
'LightGBMXT_BAG_L2': 'AutogluonModels/ag-20221226_133540/models/LightGBMXT_BAG_L2/',
'LightGBM_BAG_L2': 'AutogluonModels/ag-20221226_133540/models/LightGBM_BAG_L2/',
'RandomForestMSE_BAG_L2': 'AutogluonModels/ag-20221226_133540/models/RandomForestMSE_BAG_L2/',
'CatBoost_BAG_L2': 'AutogluonModels/ag-20221226_133540/models/CatBoost_BAG_L2/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_133540/models/WeightedEnsemble_L3/'},
'model_fit_times': {'KNeighborsUnif_BAG_L1': 0.04161834716796875,
'KNeighborsDist_BAG_L1': 0.04085206985473633,
'LightGBMXT_BAG_L1': 97.28214597702026,
'LightGBM_BAG_L1': 48.93160367012024,
'RandomForestMSE_BAG_L1': 16.297994375228882,
'CatBoost_BAG_L1': 189.89348196983337,
'ExtraTreesMSE_BAG_L1': 7.4860148429870605,
'NeuralNetFastAI_BAG_L1': 39.1437029838562,
'WeightedEnsemble_L2': 0.7667412757873535,
'LightGBMXT_BAG_L2': 29.83490538597107,
'LightGBM_BAG_L2': 27.99138379096985,
'RandomForestMSE_BAG_L2': 34.956705808639526,
'CatBoost_BAG_L2': 65.76671624183655,
'WeightedEnsemble_L3': 0.28935742378234863},
'model_pred_times': {'KNeighborsUnif_BAG_L1': 0.10655593872070312,
'KNeighborsDist_BAG_L1': 0.10412335395812988,
'LightGBMXT_BAG_L1': 13.731958150863647,
'LightGBM_BAG_L1': 3.23972749710083,
'RandomForestMSE_BAG_L1': 0.651644229888916,
'CatBoost_BAG_L1': 0.19429564476013184,
'ExtraTreesMSE_BAG_L1': 0.6163928508758545,
'NeuralNetFastAI_BAG_L1': 0.8532841205596924,
'WeightedEnsemble_L2': 0.001369476318359375,
'LightGBMXT_BAG_L2': 0.5999336242675781,
'LightGBM_BAG_L2': 0.28994297981262207,
'RandomForestMSE_BAG_L2': 0.6851177215576172,
'CatBoost_BAG_L2': 0.12907099723815918,
'WeightedEnsemble_L3': 0.0008366107940673828},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'KNeighborsUnif_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'KNeighborsDist_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'LightGBMXT_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTreesMSE_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'NeuralNetFastAI_BAG_L1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBMXT_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForestMSE_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L3 -30.169168 21.202884 557.956483
1 LightGBM_BAG_L2 -30.432906 19.787925 427.108798
2 CatBoost_BAG_L2 -30.627700 19.627053 464.884130
3 LightGBMXT_BAG_L2 -31.078828 20.097915 428.952320
4 RandomForestMSE_BAG_L2 -31.306516 20.183100 434.074120
5 WeightedEnsemble_L2 -31.628556 17.923118 353.212819
6 LightGBMXT_BAG_L1 -32.972358 13.731958 97.282146
7 LightGBM_BAG_L1 -33.540630 3.239727 48.931604
8 CatBoost_BAG_L1 -33.900201 0.194296 189.893482
9 ExtraTreesMSE_BAG_L1 -37.423673 0.616393 7.486015
10 RandomForestMSE_BAG_L1 -38.283140 0.651644 16.297994
11 NeuralNetFastAI_BAG_L1 -82.125954 0.853284 39.143703
12 KNeighborsDist_BAG_L1 -84.125061 0.104123 0.040852
13 KNeighborsUnif_BAG_L1 -101.546199 0.106556 0.041618
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000837 0.289357 3 True
1 0.289943 27.991384 2 True
2 0.129071 65.766716 2 True
3 0.599934 29.834905 2 True
4 0.685118 34.956706 2 True
5 0.001369 0.766741 2 True
6 13.731958 97.282146 1 True
7 3.239727 48.931604 1 True
8 0.194296 189.893482 1 True
9 0.616393 7.486015 1 True
10 0.651644 16.297994 1 True
11 0.853284 39.143703 1 True
12 0.104123 0.040852 1 True
13 0.106556 0.041618 1 True
fit_order
0 14
1 11
2 13
3 10
4 12
5 9
6 3
7 4
8 6
9 7
10 5
11 8
12 2
13 1 }
predictor_new_hpo_3c.leaderboard(silent=True)
| | model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L3 | -30.169168 | 21.202884 | 557.956483 | 0.000837 | 0.289357 | 3 | True | 14 |
| 1 | LightGBM_BAG_L2 | -30.432906 | 19.787925 | 427.108798 | 0.289943 | 27.991384 | 2 | True | 11 |
| 2 | CatBoost_BAG_L2 | -30.627700 | 19.627053 | 464.884130 | 0.129071 | 65.766716 | 2 | True | 13 |
| 3 | LightGBMXT_BAG_L2 | -31.078828 | 20.097915 | 428.952320 | 0.599934 | 29.834905 | 2 | True | 10 |
| 4 | RandomForestMSE_BAG_L2 | -31.306516 | 20.183100 | 434.074120 | 0.685118 | 34.956706 | 2 | True | 12 |
| 5 | WeightedEnsemble_L2 | -31.628556 | 17.923118 | 353.212819 | 0.001369 | 0.766741 | 2 | True | 9 |
| 6 | LightGBMXT_BAG_L1 | -32.972358 | 13.731958 | 97.282146 | 13.731958 | 97.282146 | 1 | True | 3 |
| 7 | LightGBM_BAG_L1 | -33.540630 | 3.239727 | 48.931604 | 3.239727 | 48.931604 | 1 | True | 4 |
| 8 | CatBoost_BAG_L1 | -33.900201 | 0.194296 | 189.893482 | 0.194296 | 189.893482 | 1 | True | 6 |
| 9 | ExtraTreesMSE_BAG_L1 | -37.423673 | 0.616393 | 7.486015 | 0.616393 | 7.486015 | 1 | True | 7 |
| 10 | RandomForestMSE_BAG_L1 | -38.283140 | 0.651644 | 16.297994 | 0.651644 | 16.297994 | 1 | True | 5 |
| 11 | NeuralNetFastAI_BAG_L1 | -82.125954 | 0.853284 | 39.143703 | 0.853284 | 39.143703 | 1 | True | 8 |
| 12 | KNeighborsDist_BAG_L1 | -84.125061 | 0.104123 | 0.040852 | 0.104123 | 0.040852 | 1 | True | 2 |
| 13 | KNeighborsUnif_BAG_L1 | -101.546199 | 0.106556 | 0.041618 | 0.106556 | 0.041618 | 1 | True | 1 |
fig = predictor_new_hpo_3c.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_3c_leaderboard.png')
# Remember to set all negative values to zero
predictions_new_hpo_3c = predictor_new_hpo_3c.predict(test)
predictions_new_hpo_3c.describe()
count    6493.000000
mean      160.876572
std       141.959457
min         2.557756
25%        49.907635
50%       122.510849
75%       229.880219
max       811.957947
Name: count, dtype: float64
(predictions_new_hpo_3c<0).sum()
0
predictions_new_hpo_3c = predictions_new_hpo_3c.clip(lower=0)  # zero out any negative predictions
predictions_new_hpo_3c.describe()
count    6493.000000
mean      160.876582
std       141.959455
min         2.557756
25%        49.907635
50%       122.510849
75%       229.880219
max       811.957947
Name: count, dtype: float64
submission_new_hpo_3c = pd.read_csv("sampleSubmission.csv")
# Submit the predictions, same as before
submission_new_hpo_3c["count"] = predictions_new_hpo_3c
submission_new_hpo_3c.to_csv("submission_new_hpo_3c.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo_3c.csv -m "hpo 3c num_bag_sets = 5"
100%|█████████████████████████████████████████| 243k/243k [00:00<00:00, 569kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName date description status publicScore privateScore
------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ----------- ------------
submission_new_hpo_3c.csv 2022-12-26 13:46:38 hpo 3c num_bag_sets = 5 complete 0.62247 0.62247
submission_new_hpo_3b.csv 2022-12-26 13:35:38 hpo 3b num_bag_folds = 10 complete 0.63100 0.63100
submission_new_hpo_3a.csv 2022-12-26 13:24:35 hpo 3a num_stack_levels = 2 complete 0.66835 0.66835
submission_new_features_2b.csv 2022-12-26 13:13:10 new features 2b complete 0.65357 0.65357
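Among the three HPO experiments, 3c (`num_bag_sets=5`) achieves the lowest (best) public RMSLE. A quick stdlib sketch ranking the public scores copied from the submissions listing above (the dict name and labels are illustrative):

```python
# Public Kaggle scores (RMSLE, lower is better) from the listing above.
hpo_scores = {
    "hpo_3a num_stack_levels=2": 0.66835,
    "hpo_3b num_bag_folds=10": 0.63100,
    "hpo_3c num_bag_sets=5": 0.62247,
}
best = min(hpo_scores, key=hpo_scores.get)
print(best)  # → hpo_3c num_bag_sets=5
```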
from sklearn.model_selection import train_test_split
import autogluon.core as ag
train_split, val_split = train_test_split(train, test_size=0.1, random_state=0)
nn_options = {  # hyperparameter search space for the neural network
    'num_epochs': 10,
    'learning_rate': ag.space.Real(1e-4, 1e-2, default=5e-4, log=True),
    'activation': ag.space.Categorical('relu', 'softrelu', 'tanh'),
    'dropout_prob': ag.space.Real(0.0, 0.5, default=0.1),
}
gbm_options = {  # search space for LightGBM
    'num_boost_round': ag.space.Int(lower=100, upper=1000),
    'num_leaves': ag.space.Int(lower=26, upper=66, default=36),
}
rf_options = {  # search space for random forest
    'n_estimators': ag.space.Int(lower=150, upper=500)
}
xt_options = {  # search space for extra trees
    'n_estimators': ag.space.Int(lower=150, upper=500)
}
cat_options = {  # search space for CatBoost
    'iterations': ag.space.Int(lower=1000, upper=10000)
}
hyperparameters = {  # search space of each model type
    'GBM': gbm_options,
    'NN_TORCH': nn_options,  # NOTE: comment this line out if you get errors on Mac OSX
    'RF': rf_options,
    'XT': xt_options,
    'CAT': cat_options,
}  # model types missing from this dict are not trained
time_limit = 10 * 60  # train the various models for up to ~10 minutes in total
num_trials = 20  # try at most 20 hyperparameter configurations per model type
search_strategy = 'bayes'  # tune hyperparameters with Bayesian optimization
label = 'count'
metric = 'root_mean_squared_error'
hyperparameter_tune_kwargs = {  # HPO is not performed unless hyperparameter_tune_kwargs is specified
    'num_trials': num_trials,
    'scheduler': 'local',
    'searcher': search_strategy,
}
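The `'bayes'` searcher uses Bayesian optimization over the `ag.space` ranges above; for intuition only, here is a pure-Python sketch of the simpler random-search baseline over ranges shaped like `gbm_options`. It is independent of AutoGluon, and the names `gbm_space` and `sample_config` are illustrative:

```python
import random

# Illustrative stand-in for the AutoGluon search space defined above:
# each entry is an inclusive (lower, upper) integer range.
gbm_space = {
    "num_boost_round": (100, 1000),
    "num_leaves": (26, 66),
}

def sample_config(space, rng):
    """Draw one random candidate configuration from the integer ranges."""
    return {name: rng.randint(lo, hi) for name, (lo, hi) in space.items()}

rng = random.Random(0)
trials = [sample_config(gbm_space, rng) for _ in range(20)]  # num_trials = 20
```

Bayesian optimization differs in that each new trial is chosen using the validation scores of earlier trials, rather than independently at random.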
predictor_hpo_3d = TabularPredictor(
label=label,
eval_metric=metric
).fit(
train_data = train_split,
tuning_data = val_split,
time_limit=time_limit,
hyperparameters=hyperparameters,
hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
)
Fitted model: NeuralNetTorch/2a2b3c80 ...
-70.041 = Validation score (-root_mean_squared_error)
8.27s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch/2b04e552 ...
-88.7031 = Validation score (-root_mean_squared_error)
11.33s = Training runtime
0.05s = Validation runtime
Fitted model: NeuralNetTorch/32b7a46a ...
-112.1642 = Validation score (-root_mean_squared_error)
13.32s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch/3492d20a ...
-71.1961 = Validation score (-root_mean_squared_error)
6.98s = Training runtime
0.03s = Validation runtime
Fitted model: NeuralNetTorch/3ad324bc ...
-66.5796 = Validation score (-root_mean_squared_error)
8.53s = Training runtime
0.03s = Validation runtime
Fitted model: NeuralNetTorch/3b1c196a ...
-108.5733 = Validation score (-root_mean_squared_error)
8.13s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch/3b2a9648 ...
-109.926 = Validation score (-root_mean_squared_error)
10.45s = Training runtime
0.04s = Validation runtime
Fitted model: NeuralNetTorch/404a4fec ...
-113.3021 = Validation score (-root_mean_squared_error)
8.44s = Training runtime
0.03s = Validation runtime
Fitted model: NeuralNetTorch/45f2ca3c ...
-109.2155 = Validation score (-root_mean_squared_error)
8.24s = Training runtime
0.02s = Validation runtime
Fitted model: NeuralNetTorch/466117f8 ...
-59.1568 = Validation score (-root_mean_squared_error)
6.74s = Training runtime
0.03s = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 360.0s of the 395.43s of remaining time.
-33.3718 = Validation score (-root_mean_squared_error)
0.6s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 205.41s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_134722/")
predictor_hpo_3d.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L2 -33.371810 0.260009 101.341043 0.000615 0.600978 2 True 34
1 LightGBM/T10 -34.280612 0.096803 2.894914 0.096803 2.894914 1 True 10
2 LightGBM/T3 -34.387411 0.099663 2.678987 0.099663 2.678987 1 True 3
3 LightGBM/T2 -34.535660 0.026684 1.735699 0.026684 1.735699 1 True 2
4 LightGBM/T9 -34.556717 0.076220 1.812910 0.076220 1.812910 1 True 9
5 LightGBM/T5 -34.706954 0.075890 2.343083 0.075890 2.343083 1 True 5
6 LightGBM/T8 -34.947619 0.052375 1.886330 0.052375 1.886330 1 True 8
7 CatBoost/T3 -35.169444 0.011590 48.397956 0.011590 48.397956 1 True 23
8 LightGBM/T20 -35.205160 0.077559 2.119037 0.077559 2.119037 1 True 20
9 LightGBM/T17 -35.244784 0.082628 2.636436 0.082628 2.636436 1 True 17
10 LightGBM/T4 -35.269166 0.020271 1.778769 0.020271 1.778769 1 True 4
11 CatBoost/T2 -35.534501 0.024653 45.032510 0.024653 45.032510 1 True 22
12 LightGBM/T13 -35.573756 0.021226 1.020614 0.021226 1.020614 1 True 13
13 LightGBM/T7 -35.927587 0.023366 1.043127 0.023366 1.043127 1 True 7
14 LightGBM/T12 -35.978295 0.015040 0.970920 0.015040 0.970920 1 True 12
15 LightGBM/T18 -36.318749 0.076111 2.323512 0.076111 2.323512 1 True 18
16 CatBoost/T1 -36.373140 0.007612 9.286415 0.007612 9.286415 1 True 21
17 LightGBM/T19 -36.579369 0.045521 1.788286 0.045521 1.788286 1 True 19
18 LightGBM/T6 -39.451542 0.081108 2.292828 0.081108 2.292828 1 True 6
19 LightGBM/T1 -39.881060 0.011087 0.937789 0.011087 0.937789 1 True 1
20 LightGBM/T11 -41.444319 0.076936 2.522936 0.076936 2.522936 1 True 11
21 LightGBM/T15 -44.226885 0.016903 1.028713 0.016903 1.028713 1 True 15
22 NeuralNetTorch/466117f8 -59.156759 0.030447 6.742394 0.030447 6.742394 1 True 33
23 LightGBM/T16 -65.587603 0.020074 1.115781 0.020074 1.115781 1 True 16
24 NeuralNetTorch/3ad324bc -66.579595 0.028271 8.532125 0.028271 8.532125 1 True 28
25 LightGBM/T14 -68.403320 0.014486 1.021829 0.014486 1.021829 1 True 14
26 NeuralNetTorch/2a2b3c80 -70.041011 0.037631 8.266042 0.037631 8.266042 1 True 24
27 NeuralNetTorch/3492d20a -71.196096 0.032587 6.984771 0.032587 6.984771 1 True 27
28 NeuralNetTorch/2b04e552 -88.703094 0.049195 11.326621 0.049195 11.326621 1 True 25
29 NeuralNetTorch/3b1c196a -108.573278 0.040355 8.126105 0.040355 8.126105 1 True 29
30 NeuralNetTorch/45f2ca3c -109.215478 0.016575 8.238420 0.016575 8.238420 1 True 32
31 NeuralNetTorch/3b2a9648 -109.925967 0.041303 10.450331 0.041303 10.450331 1 True 30
32 NeuralNetTorch/32b7a46a -112.164166 0.036151 13.324193 0.036151 13.324193 1 True 26
33 NeuralNetTorch/404a4fec -113.302054 0.025529 8.443241 0.025529 8.443241 1 True 31
Number of models trained: 34
Types of models trained:
{'CatBoostModel', 'LGBModel', 'WeightedEnsembleModel', 'TabularNeuralNetTorchModel'}
Bagging used: False
Multi-layer stack-ensembling used: False
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_134722/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'LightGBM/T1': 'LGBModel',
'LightGBM/T2': 'LGBModel',
'LightGBM/T3': 'LGBModel',
'LightGBM/T4': 'LGBModel',
'LightGBM/T5': 'LGBModel',
'LightGBM/T6': 'LGBModel',
'LightGBM/T7': 'LGBModel',
'LightGBM/T8': 'LGBModel',
'LightGBM/T9': 'LGBModel',
'LightGBM/T10': 'LGBModel',
'LightGBM/T11': 'LGBModel',
'LightGBM/T12': 'LGBModel',
'LightGBM/T13': 'LGBModel',
'LightGBM/T14': 'LGBModel',
'LightGBM/T15': 'LGBModel',
'LightGBM/T16': 'LGBModel',
'LightGBM/T17': 'LGBModel',
'LightGBM/T18': 'LGBModel',
'LightGBM/T19': 'LGBModel',
'LightGBM/T20': 'LGBModel',
'CatBoost/T1': 'CatBoostModel',
'CatBoost/T2': 'CatBoostModel',
'CatBoost/T3': 'CatBoostModel',
'NeuralNetTorch/2a2b3c80': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/2b04e552': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/32b7a46a': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/3492d20a': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/3ad324bc': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/3b1c196a': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/3b2a9648': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/404a4fec': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/45f2ca3c': 'TabularNeuralNetTorchModel',
'NeuralNetTorch/466117f8': 'TabularNeuralNetTorchModel',
'WeightedEnsemble_L2': 'WeightedEnsembleModel'},
'model_performance': {'LightGBM/T1': -39.881060342635905,
'LightGBM/T2': -34.535659737832084,
'LightGBM/T3': -34.38741120590596,
'LightGBM/T4': -35.26916574495314,
'LightGBM/T5': -34.706954343465384,
'LightGBM/T6': -39.45154185293568,
'LightGBM/T7': -35.92758696111444,
'LightGBM/T8': -34.947618629023445,
'LightGBM/T9': -34.5567168392466,
'LightGBM/T10': -34.280611546537756,
'LightGBM/T11': -41.44431901188725,
'LightGBM/T12': -35.97829452569002,
'LightGBM/T13': -35.57375598774142,
'LightGBM/T14': -68.40331951460523,
'LightGBM/T15': -44.22688481900755,
'LightGBM/T16': -65.58760264680383,
'LightGBM/T17': -35.24478433829977,
'LightGBM/T18': -36.318748965765295,
'LightGBM/T19': -36.57936917600556,
'LightGBM/T20': -35.205160194962225,
'CatBoost/T1': -36.373139513240595,
'CatBoost/T2': -35.534500919656544,
'CatBoost/T3': -35.16944415939138,
'NeuralNetTorch/2a2b3c80': -70.04101089384326,
'NeuralNetTorch/2b04e552': -88.70309391318011,
'NeuralNetTorch/32b7a46a': -112.16416602981505,
'NeuralNetTorch/3492d20a': -71.19609561198388,
'NeuralNetTorch/3ad324bc': -66.57959533150182,
'NeuralNetTorch/3b1c196a': -108.57327782573907,
'NeuralNetTorch/3b2a9648': -109.92596679879752,
'NeuralNetTorch/404a4fec': -113.30205412326495,
'NeuralNetTorch/45f2ca3c': -109.21547788875446,
'NeuralNetTorch/466117f8': -59.15675948834796,
'WeightedEnsemble_L2': -33.3718101330817},
'model_best': 'WeightedEnsemble_L2',
'model_paths': {'LightGBM/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T1/',
'LightGBM/T2': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T2/',
'LightGBM/T3': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T3/',
'LightGBM/T4': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T4/',
'LightGBM/T5': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T5/',
'LightGBM/T6': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T6/',
'LightGBM/T7': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T7/',
'LightGBM/T8': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T8/',
'LightGBM/T9': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T9/',
'LightGBM/T10': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T10/',
'LightGBM/T11': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T11/',
'LightGBM/T12': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T12/',
'LightGBM/T13': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T13/',
'LightGBM/T14': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T14/',
'LightGBM/T15': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T15/',
'LightGBM/T16': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T16/',
'LightGBM/T17': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T17/',
'LightGBM/T18': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T18/',
'LightGBM/T19': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T19/',
'LightGBM/T20': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/LightGBM/T20/',
'CatBoost/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/CatBoost/T1/',
'CatBoost/T2': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/CatBoost/T2/',
'CatBoost/T3': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/CatBoost/T3/',
'NeuralNetTorch/2a2b3c80': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/2a2b3c80/',
'NeuralNetTorch/2b04e552': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/2b04e552/',
'NeuralNetTorch/32b7a46a': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/32b7a46a/',
'NeuralNetTorch/3492d20a': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/3492d20a/',
'NeuralNetTorch/3ad324bc': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/3ad324bc/',
'NeuralNetTorch/3b1c196a': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/3b1c196a/',
'NeuralNetTorch/3b2a9648': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/3b2a9648/',
'NeuralNetTorch/404a4fec': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/404a4fec/',
'NeuralNetTorch/45f2ca3c': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/45f2ca3c/',
'NeuralNetTorch/466117f8': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_134722/models/NeuralNetTorch/466117f8/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_134722/models/WeightedEnsemble_L2/'},
'model_fit_times': {'LightGBM/T1': 0.9377894401550293,
'LightGBM/T2': 1.7356986999511719,
'LightGBM/T3': 2.6789870262145996,
'LightGBM/T4': 1.77876877784729,
'LightGBM/T5': 2.3430826663970947,
'LightGBM/T6': 2.29282808303833,
'LightGBM/T7': 1.0431272983551025,
'LightGBM/T8': 1.8863298892974854,
'LightGBM/T9': 1.8129100799560547,
'LightGBM/T10': 2.894913673400879,
'LightGBM/T11': 2.5229361057281494,
'LightGBM/T12': 0.9709198474884033,
'LightGBM/T13': 1.020613670349121,
'LightGBM/T14': 1.0218286514282227,
'LightGBM/T15': 1.0287132263183594,
'LightGBM/T16': 1.1157805919647217,
'LightGBM/T17': 2.6364355087280273,
'LightGBM/T18': 2.323512315750122,
'LightGBM/T19': 1.788285732269287,
'LightGBM/T20': 2.119036912918091,
'CatBoost/T1': 9.286415338516235,
'CatBoost/T2': 45.03251028060913,
'CatBoost/T3': 48.397956132888794,
'NeuralNetTorch/2a2b3c80': 8.266042232513428,
'NeuralNetTorch/2b04e552': 11.32662057876587,
'NeuralNetTorch/32b7a46a': 13.324192523956299,
'NeuralNetTorch/3492d20a': 6.9847705364227295,
'NeuralNetTorch/3ad324bc': 8.532124519348145,
'NeuralNetTorch/3b1c196a': 8.126105308532715,
'NeuralNetTorch/3b2a9648': 10.450331449508667,
'NeuralNetTorch/404a4fec': 8.443240880966187,
'NeuralNetTorch/45f2ca3c': 8.238419532775879,
'NeuralNetTorch/466117f8': 6.742393732070923,
'WeightedEnsemble_L2': 0.6009776592254639},
'model_pred_times': {'LightGBM/T1': 0.011086702346801758,
'LightGBM/T2': 0.026684284210205078,
'LightGBM/T3': 0.09966349601745605,
'LightGBM/T4': 0.02027130126953125,
'LightGBM/T5': 0.07588958740234375,
'LightGBM/T6': 0.08110785484313965,
'LightGBM/T7': 0.02336573600769043,
'LightGBM/T8': 0.052375078201293945,
'LightGBM/T9': 0.07621955871582031,
'LightGBM/T10': 0.0968027114868164,
'LightGBM/T11': 0.0769355297088623,
'LightGBM/T12': 0.015039682388305664,
'LightGBM/T13': 0.021225690841674805,
'LightGBM/T14': 0.014485597610473633,
'LightGBM/T15': 0.016902685165405273,
'LightGBM/T16': 0.02007436752319336,
'LightGBM/T17': 0.08262777328491211,
'LightGBM/T18': 0.07611083984375,
'LightGBM/T19': 0.04552054405212402,
'LightGBM/T20': 0.07755923271179199,
'CatBoost/T1': 0.007611751556396484,
'CatBoost/T2': 0.024653196334838867,
'CatBoost/T3': 0.011589765548706055,
'NeuralNetTorch/2a2b3c80': 0.03763079643249512,
'NeuralNetTorch/2b04e552': 0.0491948127746582,
'NeuralNetTorch/32b7a46a': 0.03615093231201172,
'NeuralNetTorch/3492d20a': 0.032587289810180664,
'NeuralNetTorch/3ad324bc': 0.028271198272705078,
'NeuralNetTorch/3b1c196a': 0.04035520553588867,
'NeuralNetTorch/3b2a9648': 0.04130268096923828,
'NeuralNetTorch/404a4fec': 0.02552938461303711,
'NeuralNetTorch/45f2ca3c': 0.016575336456298828,
'NeuralNetTorch/466117f8': 0.030446529388427734,
'WeightedEnsemble_L2': 0.0006151199340820312},
'num_bag_folds': 0,
'max_stack_level': 2,
'model_hyperparams': {'LightGBM/T1': {'learning_rate': 0.05,
'num_boost_round': 100,
'num_leaves': 36,
'feature_fraction': 1.0,
'min_data_in_leaf': 20},
'LightGBM/T2': {'learning_rate': 0.06994332504138305,
'num_boost_round': 863,
'num_leaves': 29,
'feature_fraction': 0.8872033759818312,
'min_data_in_leaf': 5},
'LightGBM/T3': {'learning_rate': 0.049883446878335284,
'num_boost_round': 904,
'num_leaves': 49,
'feature_fraction': 0.9618129346960314,
'min_data_in_leaf': 52},
'LightGBM/T4': {'learning_rate': 0.17491036983395955,
'num_boost_round': 805,
'num_leaves': 64,
'feature_fraction': 0.97294325019552,
'min_data_in_leaf': 60},
'LightGBM/T5': {'learning_rate': 0.029371236410262794,
'num_boost_round': 777,
'num_leaves': 51,
'feature_fraction': 0.9530421821938733,
'min_data_in_leaf': 19},
'LightGBM/T6': {'learning_rate': 0.006895350858996428,
'num_boost_round': 855,
'num_leaves': 31,
'feature_fraction': 0.7677590145494717,
'min_data_in_leaf': 53},
'LightGBM/T7': {'learning_rate': 0.12381739396810261,
'num_boost_round': 199,
'num_leaves': 50,
'feature_fraction': 0.9445391877374626,
'min_data_in_leaf': 20},
'LightGBM/T8': {'learning_rate': 0.03410406526204423,
'num_boost_round': 523,
'num_leaves': 58,
'feature_fraction': 0.9502276879949111,
'min_data_in_leaf': 21},
'LightGBM/T9': {'learning_rate': 0.036297286185258806,
'num_boost_round': 643,
'num_leaves': 36,
'feature_fraction': 0.8955049480187768,
'min_data_in_leaf': 34},
'LightGBM/T10': {'learning_rate': 0.01326795426943024,
'num_boost_round': 982,
'num_leaves': 54,
'feature_fraction': 0.8536654849976308,
'min_data_in_leaf': 13},
'LightGBM/T11': {'learning_rate': 0.005358859760545354,
'num_boost_round': 650,
'num_leaves': 66,
'feature_fraction': 0.8921084872171621,
'min_data_in_leaf': 55},
'LightGBM/T12': {'learning_rate': 0.16252158262855104,
'num_boost_round': 142,
'num_leaves': 57,
'feature_fraction': 0.9042334992186892,
'min_data_in_leaf': 43},
'LightGBM/T13': {'learning_rate': 0.06555669217258556,
'num_boost_round': 157,
'num_leaves': 61,
'feature_fraction': 0.8592579884498354,
'min_data_in_leaf': 43},
'LightGBM/T14': {'learning_rate': 0.009392418014048174,
'num_boost_round': 191,
'num_leaves': 26,
'feature_fraction': 0.9132850089494844,
'min_data_in_leaf': 20},
'LightGBM/T15': {'learning_rate': 0.01912742372887113,
'num_boost_round': 184,
'num_leaves': 37,
'feature_fraction': 0.8288570877310459,
'min_data_in_leaf': 59},
'LightGBM/T16': {'learning_rate': 0.007285375030920856,
'num_boost_round': 231,
'num_leaves': 38,
'feature_fraction': 0.9970934595148065,
'min_data_in_leaf': 54},
'LightGBM/T17': {'learning_rate': 0.012727946668610704,
'num_boost_round': 760,
'num_leaves': 61,
'feature_fraction': 0.9132770813663496,
'min_data_in_leaf': 17},
'LightGBM/T18': {'learning_rate': 0.017397164934517897,
'num_boost_round': 953,
'num_leaves': 31,
'feature_fraction': 0.905877525282967,
'min_data_in_leaf': 55},
'LightGBM/T19': {'learning_rate': 0.019484526128795285,
'num_boost_round': 706,
'num_leaves': 26,
'feature_fraction': 0.7991455904200133,
'min_data_in_leaf': 7},
'LightGBM/T20': {'learning_rate': 0.060008551897487356,
'num_boost_round': 961,
'num_leaves': 29,
'feature_fraction': 0.9057115238866673,
'min_data_in_leaf': 50},
'CatBoost/T1': {'iterations': 1000,
'learning_rate': 0.05,
'random_seed': 0,
'allow_writing_files': False,
'eval_metric': 'RMSE',
'depth': 6,
'l2_leaf_reg': 3},
'CatBoost/T2': {'iterations': 4264,
'learning_rate': 0.03731689940050502,
'random_seed': 0,
'allow_writing_files': False,
'eval_metric': 'RMSE',
'depth': 5,
'l2_leaf_reg': 3.4110535042865755},
'CatBoost/T3': {'iterations': 8891,
'learning_rate': 0.025119498706364835,
'random_seed': 0,
'allow_writing_files': False,
'eval_metric': 'RMSE',
'depth': 6,
'l2_leaf_reg': 3.5835764522666245},
'NeuralNetTorch/2a2b3c80': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'relu',
'embedding_size_factor': 1.0,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.1,
'optimizer': 'adam',
'learning_rate': 0.0005,
'weight_decay': 1e-06,
'proc.embed_min_categories': 4,
'proc.impute_strategy': 'median',
'proc.max_category_levels': 100,
'proc.skew_threshold': 0.99,
'use_ngram_features': False,
'num_layers': 2,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': False,
'loss_function': 'auto'},
'NeuralNetTorch/2b04e552': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'tanh',
'embedding_size_factor': 0.7,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.28602117328419646,
'optimizer': 'adam',
'learning_rate': 0.00038841662547517604,
'weight_decay': 9.525820882809338e-06,
'proc.embed_min_categories': 3,
'proc.impute_strategy': 'most_frequent',
'proc.max_category_levels': 20,
'proc.skew_threshold': 0.999,
'use_ngram_features': False,
'num_layers': 4,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': False,
'loss_function': 'auto'},
'NeuralNetTorch/32b7a46a': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'softrelu',
'embedding_size_factor': 1.3,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.07948083953522483,
'optimizer': 'adam',
'learning_rate': 0.000272264324493386,
'weight_decay': 3.979314106188912e-06,
'proc.embed_min_categories': 3,
'proc.impute_strategy': 'most_frequent',
'proc.max_category_levels': 400,
'proc.skew_threshold': 10.0,
'use_ngram_features': False,
'num_layers': 3,
'hidden_size': 256,
'max_batch_size': 512,
'use_batchnorm': True,
'loss_function': 'auto'},
'NeuralNetTorch/3492d20a': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'tanh',
'embedding_size_factor': 1.0,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.14873741418520353,
'optimizer': 'adam',
'learning_rate': 0.0010150251504637965,
'weight_decay': 1.6509399760982143e-09,
'proc.embed_min_categories': 1000,
'proc.impute_strategy': 'mean',
'proc.max_category_levels': 20,
'proc.skew_threshold': 0.9,
'use_ngram_features': False,
'num_layers': 3,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': False,
'loss_function': 'auto'},
'NeuralNetTorch/3ad324bc': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'tanh',
'embedding_size_factor': 1.1,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.28242005569078843,
'optimizer': 'adam',
'learning_rate': 0.003167329225609986,
'weight_decay': 0.0048194893513830404,
'proc.embed_min_categories': 10,
'proc.impute_strategy': 'median',
'proc.max_category_levels': 20,
'proc.skew_threshold': 100.0,
'use_ngram_features': False,
'num_layers': 2,
'hidden_size': 256,
'max_batch_size': 512,
'use_batchnorm': True,
'loss_function': 'auto'},
'NeuralNetTorch/3b1c196a': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'softrelu',
'embedding_size_factor': 0.8,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.10207705026241903,
'optimizer': 'adam',
'learning_rate': 0.004107238794222514,
'weight_decay': 1.1692398360337485e-05,
'proc.embed_min_categories': 3,
'proc.impute_strategy': 'mean',
'proc.max_category_levels': 10,
'proc.skew_threshold': 0.2,
'use_ngram_features': False,
'num_layers': 3,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': False,
'loss_function': 'auto'},
'NeuralNetTorch/3b2a9648': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'softrelu',
'embedding_size_factor': 0.5,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.2988844223107875,
'optimizer': 'adam',
'learning_rate': 0.0004495274409467887,
'weight_decay': 8.272927840622487e-06,
'proc.embed_min_categories': 3,
'proc.impute_strategy': 'mean',
'proc.max_category_levels': 400,
'proc.skew_threshold': 100.0,
'use_ngram_features': False,
'num_layers': 3,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': True,
'loss_function': 'auto'},
'NeuralNetTorch/404a4fec': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'softrelu',
'embedding_size_factor': 1.1,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.3232742250750223,
'optimizer': 'adam',
'learning_rate': 0.0001230167695529213,
'weight_decay': 1.4510603080143505e-09,
'proc.embed_min_categories': 4,
'proc.impute_strategy': 'mean',
'proc.max_category_levels': 400,
'proc.skew_threshold': 0.99,
'use_ngram_features': False,
'num_layers': 2,
'hidden_size': 256,
'max_batch_size': 512,
'use_batchnorm': False,
'loss_function': 'auto'},
'NeuralNetTorch/45f2ca3c': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'softrelu',
'embedding_size_factor': 1.4,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.3741605320005933,
'optimizer': 'adam',
'learning_rate': 0.0002918852964847386,
'weight_decay': 2.482496442931436e-07,
'proc.embed_min_categories': 4,
'proc.impute_strategy': 'mean',
'proc.max_category_levels': 10,
'proc.skew_threshold': 0.99,
'use_ngram_features': False,
'num_layers': 2,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': True,
'loss_function': 'auto'},
'NeuralNetTorch/466117f8': {'num_epochs': 10,
'epochs_wo_improve': 20,
'activation': 'relu',
'embedding_size_factor': 0.5,
'embed_exponent': 0.56,
'max_embedding_dim': 100,
'y_range': None,
'y_range_extend': 0.05,
'dropout_prob': 0.1680996838225105,
'optimizer': 'adam',
'learning_rate': 0.004200555590858911,
'weight_decay': 8.430171875185889e-07,
'proc.embed_min_categories': 1000,
'proc.impute_strategy': 'median',
'proc.max_category_levels': 200,
'proc.skew_threshold': 0.8,
'use_ngram_features': False,
'num_layers': 2,
'hidden_size': 128,
'max_batch_size': 512,
'use_batchnorm': True,
'loss_function': 'auto'},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L2 -33.371810 0.260009 101.341043
1 LightGBM/T10 -34.280612 0.096803 2.894914
2 LightGBM/T3 -34.387411 0.099663 2.678987
3 LightGBM/T2 -34.535660 0.026684 1.735699
4 LightGBM/T9 -34.556717 0.076220 1.812910
5 LightGBM/T5 -34.706954 0.075890 2.343083
6 LightGBM/T8 -34.947619 0.052375 1.886330
7 CatBoost/T3 -35.169444 0.011590 48.397956
8 LightGBM/T20 -35.205160 0.077559 2.119037
9 LightGBM/T17 -35.244784 0.082628 2.636436
10 LightGBM/T4 -35.269166 0.020271 1.778769
11 CatBoost/T2 -35.534501 0.024653 45.032510
12 LightGBM/T13 -35.573756 0.021226 1.020614
13 LightGBM/T7 -35.927587 0.023366 1.043127
14 LightGBM/T12 -35.978295 0.015040 0.970920
15 LightGBM/T18 -36.318749 0.076111 2.323512
16 CatBoost/T1 -36.373140 0.007612 9.286415
17 LightGBM/T19 -36.579369 0.045521 1.788286
18 LightGBM/T6 -39.451542 0.081108 2.292828
19 LightGBM/T1 -39.881060 0.011087 0.937789
20 LightGBM/T11 -41.444319 0.076936 2.522936
21 LightGBM/T15 -44.226885 0.016903 1.028713
22 NeuralNetTorch/466117f8 -59.156759 0.030447 6.742394
23 LightGBM/T16 -65.587603 0.020074 1.115781
24 NeuralNetTorch/3ad324bc -66.579595 0.028271 8.532125
25 LightGBM/T14 -68.403320 0.014486 1.021829
26 NeuralNetTorch/2a2b3c80 -70.041011 0.037631 8.266042
27 NeuralNetTorch/3492d20a -71.196096 0.032587 6.984771
28 NeuralNetTorch/2b04e552 -88.703094 0.049195 11.326621
29 NeuralNetTorch/3b1c196a -108.573278 0.040355 8.126105
30 NeuralNetTorch/45f2ca3c -109.215478 0.016575 8.238420
31 NeuralNetTorch/3b2a9648 -109.925967 0.041303 10.450331
32 NeuralNetTorch/32b7a46a -112.164166 0.036151 13.324193
33 NeuralNetTorch/404a4fec -113.302054 0.025529 8.443241
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000615 0.600978 2 True
1 0.096803 2.894914 1 True
2 0.099663 2.678987 1 True
3 0.026684 1.735699 1 True
4 0.076220 1.812910 1 True
5 0.075890 2.343083 1 True
6 0.052375 1.886330 1 True
7 0.011590 48.397956 1 True
8 0.077559 2.119037 1 True
9 0.082628 2.636436 1 True
10 0.020271 1.778769 1 True
11 0.024653 45.032510 1 True
12 0.021226 1.020614 1 True
13 0.023366 1.043127 1 True
14 0.015040 0.970920 1 True
15 0.076111 2.323512 1 True
16 0.007612 9.286415 1 True
17 0.045521 1.788286 1 True
18 0.081108 2.292828 1 True
19 0.011087 0.937789 1 True
20 0.076936 2.522936 1 True
21 0.016903 1.028713 1 True
22 0.030447 6.742394 1 True
23 0.020074 1.115781 1 True
24 0.028271 8.532125 1 True
25 0.014486 1.021829 1 True
26 0.037631 8.266042 1 True
27 0.032587 6.984771 1 True
28 0.049195 11.326621 1 True
29 0.040355 8.126105 1 True
30 0.016575 8.238420 1 True
31 0.041303 10.450331 1 True
32 0.036151 13.324193 1 True
33 0.025529 8.443241 1 True
fit_order
0 34
1 10
2 3
3 2
4 9
5 5
6 8
7 23
8 20
9 17
10 4
11 22
12 13
13 7
14 12
15 18
16 21
17 19
18 6
19 1
20 11
21 15
22 33
23 16
24 28
25 14
26 24
27 27
28 25
29 29
30 32
31 30
32 26
33 31 }
predictor_hpo_3d.leaderboard(silent=True)
| model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L2 | -33.371810 | 0.260009 | 101.341043 | 0.000615 | 0.600978 | 2 | True | 34 |
| 1 | LightGBM/T10 | -34.280612 | 0.096803 | 2.894914 | 0.096803 | 2.894914 | 1 | True | 10 |
| 2 | LightGBM/T3 | -34.387411 | 0.099663 | 2.678987 | 0.099663 | 2.678987 | 1 | True | 3 |
| 3 | LightGBM/T2 | -34.535660 | 0.026684 | 1.735699 | 0.026684 | 1.735699 | 1 | True | 2 |
| 4 | LightGBM/T9 | -34.556717 | 0.076220 | 1.812910 | 0.076220 | 1.812910 | 1 | True | 9 |
| 5 | LightGBM/T5 | -34.706954 | 0.075890 | 2.343083 | 0.075890 | 2.343083 | 1 | True | 5 |
| 6 | LightGBM/T8 | -34.947619 | 0.052375 | 1.886330 | 0.052375 | 1.886330 | 1 | True | 8 |
| 7 | CatBoost/T3 | -35.169444 | 0.011590 | 48.397956 | 0.011590 | 48.397956 | 1 | True | 23 |
| 8 | LightGBM/T20 | -35.205160 | 0.077559 | 2.119037 | 0.077559 | 2.119037 | 1 | True | 20 |
| 9 | LightGBM/T17 | -35.244784 | 0.082628 | 2.636436 | 0.082628 | 2.636436 | 1 | True | 17 |
| 10 | LightGBM/T4 | -35.269166 | 0.020271 | 1.778769 | 0.020271 | 1.778769 | 1 | True | 4 |
| 11 | CatBoost/T2 | -35.534501 | 0.024653 | 45.032510 | 0.024653 | 45.032510 | 1 | True | 22 |
| 12 | LightGBM/T13 | -35.573756 | 0.021226 | 1.020614 | 0.021226 | 1.020614 | 1 | True | 13 |
| 13 | LightGBM/T7 | -35.927587 | 0.023366 | 1.043127 | 0.023366 | 1.043127 | 1 | True | 7 |
| 14 | LightGBM/T12 | -35.978295 | 0.015040 | 0.970920 | 0.015040 | 0.970920 | 1 | True | 12 |
| 15 | LightGBM/T18 | -36.318749 | 0.076111 | 2.323512 | 0.076111 | 2.323512 | 1 | True | 18 |
| 16 | CatBoost/T1 | -36.373140 | 0.007612 | 9.286415 | 0.007612 | 9.286415 | 1 | True | 21 |
| 17 | LightGBM/T19 | -36.579369 | 0.045521 | 1.788286 | 0.045521 | 1.788286 | 1 | True | 19 |
| 18 | LightGBM/T6 | -39.451542 | 0.081108 | 2.292828 | 0.081108 | 2.292828 | 1 | True | 6 |
| 19 | LightGBM/T1 | -39.881060 | 0.011087 | 0.937789 | 0.011087 | 0.937789 | 1 | True | 1 |
| 20 | LightGBM/T11 | -41.444319 | 0.076936 | 2.522936 | 0.076936 | 2.522936 | 1 | True | 11 |
| 21 | LightGBM/T15 | -44.226885 | 0.016903 | 1.028713 | 0.016903 | 1.028713 | 1 | True | 15 |
| 22 | NeuralNetTorch/466117f8 | -59.156759 | 0.030447 | 6.742394 | 0.030447 | 6.742394 | 1 | True | 33 |
| 23 | LightGBM/T16 | -65.587603 | 0.020074 | 1.115781 | 0.020074 | 1.115781 | 1 | True | 16 |
| 24 | NeuralNetTorch/3ad324bc | -66.579595 | 0.028271 | 8.532125 | 0.028271 | 8.532125 | 1 | True | 28 |
| 25 | LightGBM/T14 | -68.403320 | 0.014486 | 1.021829 | 0.014486 | 1.021829 | 1 | True | 14 |
| 26 | NeuralNetTorch/2a2b3c80 | -70.041011 | 0.037631 | 8.266042 | 0.037631 | 8.266042 | 1 | True | 24 |
| 27 | NeuralNetTorch/3492d20a | -71.196096 | 0.032587 | 6.984771 | 0.032587 | 6.984771 | 1 | True | 27 |
| 28 | NeuralNetTorch/2b04e552 | -88.703094 | 0.049195 | 11.326621 | 0.049195 | 11.326621 | 1 | True | 25 |
| 29 | NeuralNetTorch/3b1c196a | -108.573278 | 0.040355 | 8.126105 | 0.040355 | 8.126105 | 1 | True | 29 |
| 30 | NeuralNetTorch/45f2ca3c | -109.215478 | 0.016575 | 8.238420 | 0.016575 | 8.238420 | 1 | True | 32 |
| 31 | NeuralNetTorch/3b2a9648 | -109.925967 | 0.041303 | 10.450331 | 0.041303 | 10.450331 | 1 | True | 30 |
| 32 | NeuralNetTorch/32b7a46a | -112.164166 | 0.036151 | 13.324193 | 0.036151 | 13.324193 | 1 | True | 26 |
| 33 | NeuralNetTorch/404a4fec | -113.302054 | 0.025529 | 8.443241 | 0.025529 | 8.443241 | 1 | True | 31 |
fig = predictor_hpo_3d.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_3d_leaderboard.png')
# Remember to set all negative values to zero
predictions_hpo_3d = predictor_hpo_3d.predict(test)
predictions_hpo_3d.describe()
count    6493.000000
mean      191.607224
std       173.620392
min       -17.750841
25%        46.533169
50%       149.707977
75%       284.303650
max       900.376465
Name: count, dtype: float64
(predictions_hpo_3d<0).sum()
89
predictions_hpo_3d = predictions_hpo_3d.apply(lambda x: 0 if x<0 else x)
predictions_hpo_3d.describe()
count    6493.000000
mean      191.643381
std       173.579907
min         0.000000
25%        46.533169
50%       149.707977
75%       284.303650
max       900.376465
Name: count, dtype: float64
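The element-wise lambda above works, but pandas offers the same post-processing as a single vectorized call via `Series.clip`. A minimal sketch with hypothetical prediction values (not the actual model output):

```python
import pandas as pd

# A few illustrative predictions (hypothetical values) containing negatives,
# as a regressor may produce before post-processing.
preds = pd.Series([-17.75, 46.5, 149.7, -0.3], name="count")

# Equivalent to preds.apply(lambda x: 0 if x < 0 else x), but vectorized.
clipped = preds.clip(lower=0)
```

`clip(lower=0)` floors every value at zero and leaves non-negative values untouched, so `describe()` afterwards reports `min 0.000000` just as above.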
submission_new_hpo_3d = pd.read_csv("sampleSubmission.csv")
# Submit predictions to Kaggle, same as before
submission_new_hpo_3d["count"] = predictions_hpo_3d
submission_new_hpo_3d.to_csv("submission_new_hpo_3d.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo_3d.csv -m "hp tuning 3d"
100%|█████████████████████████████████████████| 241k/241k [00:00<00:00, 527kB/s]
Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 6
fileName                   date                 description                  status    publicScore  privateScore
submission_new_hpo_3d.csv  2022-12-26 13:50:52  hp tuning 3d                 complete  0.52253      0.52253
submission_new_hpo_3c.csv  2022-12-26 13:46:38  hpo 3c num_bag_sets = 5      complete  0.62247      0.62247
submission_new_hpo_3b.csv  2022-12-26 13:35:38  hpo 3b num_bag_folds = 10    complete  0.63100      0.63100
submission_new_hpo_3a.csv  2022-12-26 13:24:35  hpo 3a num_stack_levels = 2  complete  0.66835      0.66835
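For the writeup, the public scores printed above can be collected into a small DataFrame and plotted the same way as the leaderboard. The values below are copied from the submissions listing; the variable names are my own and not part of the project template:

```python
import pandas as pd

# Kaggle public scores copied from the submissions listing above
# (the metric is RMSLE, so lower is better)
hpo_scores = pd.DataFrame({
    "submission": ["hpo_3a", "hpo_3b", "hpo_3c", "hpo_3d"],
    "public_score": [0.66835, 0.63100, 0.62247, 0.52253],
})

# "hp tuning 3d" achieves the best (lowest) score of the four experiments
best = hpo_scores.loc[hpo_scores["public_score"].idxmin(), "submission"]
```

A bar plot of this frame (`hpo_scores.plot(kind="bar", x="submission", y="public_score")`) gives the score-vs-experiment chart the writeup asks for.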
import autogluon.core as ag
nn_options = {
'num_epochs': 10,
'learning_rate': ag.space.Real(1e-4, 1e-2, default=5e-4, log=True),
'activation': ag.space.Categorical('relu', 'softrelu', 'tanh'),
'dropout_prob': ag.space.Real(0.0, 0.5, default=0.1),
}
gbm_options = {
    'num_boost_round': ag.space.Int(lower=100, upper=1000),
    'num_leaves': ag.space.Int(lower=26, upper=66, default=36),
}
rf_options = {
    'n_estimators': ag.space.Int(lower=150, upper=500)
}
xt_options = {
    'n_estimators': ag.space.Int(lower=150, upper=500)
}
cat_options = {
    'iterations': ag.space.Int(lower=1000, upper=10000)
}
hyperparameters = {  # hyperparameter search spaces for each model type
    'GBM': gbm_options,
    'NN_TORCH': nn_options,  # NOTE: comment this line out if you get errors on Mac OSX
    'RF': rf_options,
    'XT': xt_options,
    'CAT': cat_options,
}  # When a key is missing from this dict, no models of that type are trained
time_limit = 10*60  # train various models for up to ~10 minutes in total
num_trials = 20  # try at most 20 different hyperparameter configurations for each type of model
search_strategy = 'bayes'  # tune hyperparameters using Bayesian optimization
label = 'count'
metric = 'root_mean_squared_error'
hyperparameter_tune_kwargs = { # HPO is not performed unless hyperparameter_tune_kwargs is specified
'num_trials': num_trials,
'scheduler' : 'local',
'searcher': search_strategy,
}
predictor_hpo_3e = TabularPredictor(
label=label,
eval_metric=metric
).fit(
train_data = train,
auto_stack=True,
num_stack_levels=1,
num_bag_folds=8,
num_bag_sets=5,
time_limit=time_limit,
hyperparameters=hyperparameters,
hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
)
No model was trained during hyperparameter tuning NeuralNetTorch_BAG_L2... Skipping this model.
Completed 1/5 k-fold bagging repeats ...
Fitting model: WeightedEnsemble_L3 ... Training model for up to 360.0s of the 74.96s of remaining time.
-33.158 = Validation score (-root_mean_squared_error)
0.38s = Training runtime
0.0s = Validation runtime
AutoGluon training complete, total runtime = 525.63s ... Best model: "WeightedEnsemble_L2"
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20221226_135054/")
predictor_hpo_3e.fit_summary()
*** Summary of fit() ***
Estimated performance of each model:
model score_val pred_time_val fit_time pred_time_val_marginal fit_time_marginal stack_level can_infer fit_order
0 WeightedEnsemble_L2 -33.038009 0.001529 135.658681 0.000880 0.846915 2 True 15
1 WeightedEnsemble_L3 -33.158029 0.003626 385.054592 0.000834 0.378233 3 True 22
2 LightGBM_BAG_L2/T1 -33.484871 0.002360 292.803847 0.000132 25.689138 2 True 16
3 ExtraTrees_BAG_L2/T2 -33.565096 0.002403 283.678812 0.000175 16.564104 2 True 20
4 LightGBM_BAG_L1/T2 -33.609296 0.000131 33.872199 0.000131 33.872199 1 True 2
5 ExtraTrees_BAG_L2/T3 -33.730641 0.002363 277.433658 0.000135 10.318949 2 True 21
6 ExtraTrees_BAG_L2/T1 -33.756812 0.002407 275.210723 0.000179 8.096014 2 True 19
7 CatBoost_BAG_L2/T1 -33.821450 0.002318 317.492665 0.000090 50.377956 2 True 18
8 RandomForest_BAG_L2/T1 -33.991969 0.002394 292.045161 0.000166 24.930453 2 True 17
9 CatBoost_BAG_L1/T1 -36.826482 0.000135 71.848728 0.000135 71.848728 1 True 7
10 ExtraTrees_BAG_L1/T6 -37.282208 0.000216 16.457945 0.000216 16.457945 1 True 13
11 ExtraTrees_BAG_L1/T7 -37.387917 0.000124 14.057925 0.000124 14.057925 1 True 14
12 ExtraTrees_BAG_L1/T5 -37.424838 0.000138 11.915664 0.000138 11.915664 1 True 12
13 ExtraTrees_BAG_L1/T2 -37.427425 0.000157 10.959226 0.000157 10.959226 1 True 9
14 ExtraTrees_BAG_L1/T4 -37.506339 0.000182 9.332303 0.000182 9.332303 1 True 11
15 ExtraTrees_BAG_L1/T3 -37.648697 0.000166 6.704234 0.000166 6.704234 1 True 10
16 ExtraTrees_BAG_L1/T1 -37.917176 0.000160 5.374717 0.000160 5.374717 1 True 8
17 RandomForest_BAG_L1/T2 -38.254801 0.000247 21.524785 0.000247 21.524785 1 True 4
18 RandomForest_BAG_L1/T4 -38.329116 0.000154 17.215766 0.000154 17.215766 1 True 6
19 RandomForest_BAG_L1/T3 -38.341621 0.000166 12.632894 0.000166 12.632894 1 True 5
20 RandomForest_BAG_L1/T1 -38.490504 0.000159 9.875644 0.000159 9.875644 1 True 3
21 LightGBM_BAG_L1/T1 -39.654394 0.000092 25.342679 0.000092 25.342679 1 True 1
Number of models trained: 22
Types of models trained:
{'WeightedEnsembleModel', 'StackerEnsembleModel_LGB', 'StackerEnsembleModel_XT', 'StackerEnsembleModel_RF', 'StackerEnsembleModel_CatBoost'}
Bagging used: True (with 8 folds)
Multi-layer stack-ensembling used: True (with 3 levels)
Feature Metadata (Processed):
(raw dtype, special dtypes):
('category', []) : 3 | ['season', 'weather', 'hour_category']
('float', []) : 3 | ['temp', 'atemp', 'windspeed']
('int', []) : 6 | ['humidity', 'datetime_hour', 'datetime_day', 'datetime_month', 'datetime_dayofweek', ...]
('int', ['bool']) : 3 | ['holiday', 'workingday', 'datetime_year']
('int', ['datetime_as_int']) : 5 | ['datetime', 'datetime.year', 'datetime.month', 'datetime.day', 'datetime.dayofweek']
Plot summary of models saved to file: AutogluonModels/ag-20221226_135054/SummaryOfModels.html
*** End of fit() summary ***
{'model_types': {'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
'LightGBM_BAG_L1/T2': 'StackerEnsembleModel_LGB',
'RandomForest_BAG_L1/T1': 'StackerEnsembleModel_RF',
'RandomForest_BAG_L1/T2': 'StackerEnsembleModel_RF',
'RandomForest_BAG_L1/T3': 'StackerEnsembleModel_RF',
'RandomForest_BAG_L1/T4': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L1/T1': 'StackerEnsembleModel_CatBoost',
'ExtraTrees_BAG_L1/T1': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L1/T2': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L1/T3': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L1/T4': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L1/T5': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L1/T6': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L1/T7': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L2': 'WeightedEnsembleModel',
'LightGBM_BAG_L2/T1': 'StackerEnsembleModel_LGB',
'RandomForest_BAG_L2/T1': 'StackerEnsembleModel_RF',
'CatBoost_BAG_L2/T1': 'StackerEnsembleModel_CatBoost',
'ExtraTrees_BAG_L2/T1': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L2/T2': 'StackerEnsembleModel_XT',
'ExtraTrees_BAG_L2/T3': 'StackerEnsembleModel_XT',
'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
'model_performance': {'LightGBM_BAG_L1/T1': -39.65439439248688,
'LightGBM_BAG_L1/T2': -33.609296000838235,
'RandomForest_BAG_L1/T1': -38.49050354468141,
'RandomForest_BAG_L1/T2': -38.254801437961284,
'RandomForest_BAG_L1/T3': -38.3416208647323,
'RandomForest_BAG_L1/T4': -38.329115887810964,
'CatBoost_BAG_L1/T1': -36.82648237440223,
'ExtraTrees_BAG_L1/T1': -37.91717566332131,
'ExtraTrees_BAG_L1/T2': -37.427425158772785,
'ExtraTrees_BAG_L1/T3': -37.648697000010955,
'ExtraTrees_BAG_L1/T4': -37.50633946452237,
'ExtraTrees_BAG_L1/T5': -37.424837832670775,
'ExtraTrees_BAG_L1/T6': -37.28220801334453,
'ExtraTrees_BAG_L1/T7': -37.38791710160737,
'WeightedEnsemble_L2': -33.038008809932116,
'LightGBM_BAG_L2/T1': -33.484870712032716,
'RandomForest_BAG_L2/T1': -33.99196891368665,
'CatBoost_BAG_L2/T1': -33.82145035992648,
'ExtraTrees_BAG_L2/T1': -33.756811756093086,
'ExtraTrees_BAG_L2/T2': -33.56509633349596,
'ExtraTrees_BAG_L2/T3': -33.7306409117612,
'WeightedEnsemble_L3': -33.15802896825262},
'model_best': 'WeightedEnsemble_L2',
'model_paths': {'LightGBM_BAG_L1/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/LightGBM_BAG_L1/T1/',
'LightGBM_BAG_L1/T2': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/LightGBM_BAG_L1/T2/',
'RandomForest_BAG_L1/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/RandomForest_BAG_L1/T1/',
'RandomForest_BAG_L1/T2': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/RandomForest_BAG_L1/T2/',
'RandomForest_BAG_L1/T3': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/RandomForest_BAG_L1/T3/',
'RandomForest_BAG_L1/T4': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/RandomForest_BAG_L1/T4/',
'CatBoost_BAG_L1/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/CatBoost_BAG_L1/T1/',
'ExtraTrees_BAG_L1/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T1/',
'ExtraTrees_BAG_L1/T2': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T2/',
'ExtraTrees_BAG_L1/T3': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T3/',
'ExtraTrees_BAG_L1/T4': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T4/',
'ExtraTrees_BAG_L1/T5': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T5/',
'ExtraTrees_BAG_L1/T6': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T6/',
'ExtraTrees_BAG_L1/T7': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L1/T7/',
'WeightedEnsemble_L2': 'AutogluonModels/ag-20221226_135054/models/WeightedEnsemble_L2/',
'LightGBM_BAG_L2/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/LightGBM_BAG_L2/T1/',
'RandomForest_BAG_L2/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/RandomForest_BAG_L2/T1/',
'CatBoost_BAG_L2/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/CatBoost_BAG_L2/T1/',
'ExtraTrees_BAG_L2/T1': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L2/T1/',
'ExtraTrees_BAG_L2/T2': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L2/T2/',
'ExtraTrees_BAG_L2/T3': '/root/aws_mle_nanodegree/project_1/AutogluonModels/ag-20221226_135054/models/ExtraTrees_BAG_L2/T3/',
'WeightedEnsemble_L3': 'AutogluonModels/ag-20221226_135054/models/WeightedEnsemble_L3/'},
'model_fit_times': {'LightGBM_BAG_L1/T1': 25.342678785324097,
'LightGBM_BAG_L1/T2': 33.872198820114136,
'RandomForest_BAG_L1/T1': 9.875643730163574,
'RandomForest_BAG_L1/T2': 21.52478528022766,
'RandomForest_BAG_L1/T3': 12.632893800735474,
'RandomForest_BAG_L1/T4': 17.215765953063965,
'CatBoost_BAG_L1/T1': 71.84872794151306,
'ExtraTrees_BAG_L1/T1': 5.3747169971466064,
'ExtraTrees_BAG_L1/T2': 10.95922589302063,
'ExtraTrees_BAG_L1/T3': 6.704233884811401,
'ExtraTrees_BAG_L1/T4': 9.332302808761597,
'ExtraTrees_BAG_L1/T5': 11.915664434432983,
'ExtraTrees_BAG_L1/T6': 16.457945346832275,
'ExtraTrees_BAG_L1/T7': 14.05792498588562,
'WeightedEnsemble_L2': 0.8469147682189941,
'LightGBM_BAG_L2/T1': 25.689138412475586,
'RandomForest_BAG_L2/T1': 24.930452585220337,
'CatBoost_BAG_L2/T1': 50.37795615196228,
'ExtraTrees_BAG_L2/T1': 8.096014261245728,
'ExtraTrees_BAG_L2/T2': 16.564103603363037,
'ExtraTrees_BAG_L2/T3': 10.318948984146118,
'WeightedEnsemble_L3': 0.3782329559326172},
'model_pred_times': {'LightGBM_BAG_L1/T1': 9.226799011230469e-05,
'LightGBM_BAG_L1/T2': 0.0001308917999267578,
'RandomForest_BAG_L1/T1': 0.0001590251922607422,
'RandomForest_BAG_L1/T2': 0.0002465248107910156,
'RandomForest_BAG_L1/T3': 0.00016641616821289062,
'RandomForest_BAG_L1/T4': 0.00015354156494140625,
'CatBoost_BAG_L1/T1': 0.0001354217529296875,
'ExtraTrees_BAG_L1/T1': 0.00015997886657714844,
'ExtraTrees_BAG_L1/T2': 0.00015735626220703125,
'ExtraTrees_BAG_L1/T3': 0.00016617774963378906,
'ExtraTrees_BAG_L1/T4': 0.0001819133758544922,
'ExtraTrees_BAG_L1/T5': 0.0001380443572998047,
'ExtraTrees_BAG_L1/T6': 0.0002155303955078125,
'ExtraTrees_BAG_L1/T7': 0.00012445449829101562,
'WeightedEnsemble_L2': 0.0008804798126220703,
'LightGBM_BAG_L2/T1': 0.0001323223114013672,
'RandomForest_BAG_L2/T1': 0.00016617774963378906,
'CatBoost_BAG_L2/T1': 9.036064147949219e-05,
'ExtraTrees_BAG_L2/T1': 0.000179290771484375,
'ExtraTrees_BAG_L2/T2': 0.00017547607421875,
'ExtraTrees_BAG_L2/T3': 0.0001354217529296875,
'WeightedEnsemble_L3': 0.0008342266082763672},
'num_bag_folds': 8,
'max_stack_level': 3,
'model_hyperparams': {'LightGBM_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForest_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'RandomForest_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'RandomForest_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'RandomForest_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTrees_BAG_L1/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L1/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L1/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L1/T4': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L1/T5': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L1/T6': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L1/T7': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L2': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'LightGBM_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'RandomForest_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'CatBoost_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True},
'ExtraTrees_BAG_L2/T1': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L2/T2': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'ExtraTrees_BAG_L2/T3': {'use_orig_features': True,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True,
'use_child_oof': True},
'WeightedEnsemble_L3': {'use_orig_features': False,
'max_base_models': 25,
'max_base_models_per_type': 5,
'save_bag_folds': True}},
'leaderboard': model score_val pred_time_val fit_time \
0 WeightedEnsemble_L2 -33.038009 0.001529 135.658681
1 WeightedEnsemble_L3 -33.158029 0.003626 385.054592
2 LightGBM_BAG_L2/T1 -33.484871 0.002360 292.803847
3 ExtraTrees_BAG_L2/T2 -33.565096 0.002403 283.678812
4 LightGBM_BAG_L1/T2 -33.609296 0.000131 33.872199
5 ExtraTrees_BAG_L2/T3 -33.730641 0.002363 277.433658
6 ExtraTrees_BAG_L2/T1 -33.756812 0.002407 275.210723
7 CatBoost_BAG_L2/T1 -33.821450 0.002318 317.492665
8 RandomForest_BAG_L2/T1 -33.991969 0.002394 292.045161
9 CatBoost_BAG_L1/T1 -36.826482 0.000135 71.848728
10 ExtraTrees_BAG_L1/T6 -37.282208 0.000216 16.457945
11 ExtraTrees_BAG_L1/T7 -37.387917 0.000124 14.057925
12 ExtraTrees_BAG_L1/T5 -37.424838 0.000138 11.915664
13 ExtraTrees_BAG_L1/T2 -37.427425 0.000157 10.959226
14 ExtraTrees_BAG_L1/T4 -37.506339 0.000182 9.332303
15 ExtraTrees_BAG_L1/T3 -37.648697 0.000166 6.704234
16 ExtraTrees_BAG_L1/T1 -37.917176 0.000160 5.374717
17 RandomForest_BAG_L1/T2 -38.254801 0.000247 21.524785
18 RandomForest_BAG_L1/T4 -38.329116 0.000154 17.215766
19 RandomForest_BAG_L1/T3 -38.341621 0.000166 12.632894
20 RandomForest_BAG_L1/T1 -38.490504 0.000159 9.875644
21 LightGBM_BAG_L1/T1 -39.654394 0.000092 25.342679
pred_time_val_marginal fit_time_marginal stack_level can_infer \
0 0.000880 0.846915 2 True
1 0.000834 0.378233 3 True
2 0.000132 25.689138 2 True
3 0.000175 16.564104 2 True
4 0.000131 33.872199 1 True
5 0.000135 10.318949 2 True
6 0.000179 8.096014 2 True
7 0.000090 50.377956 2 True
8 0.000166 24.930453 2 True
9 0.000135 71.848728 1 True
10 0.000216 16.457945 1 True
11 0.000124 14.057925 1 True
12 0.000138 11.915664 1 True
13 0.000157 10.959226 1 True
14 0.000182 9.332303 1 True
15 0.000166 6.704234 1 True
16 0.000160 5.374717 1 True
17 0.000247 21.524785 1 True
18 0.000154 17.215766 1 True
19 0.000166 12.632894 1 True
20 0.000159 9.875644 1 True
21 0.000092 25.342679 1 True
fit_order
0 15
1 22
2 16
3 20
4 2
5 21
6 19
7 18
8 17
9 7
10 13
11 14
12 12
13 9
14 11
15 10
16 8
17 4
18 6
19 5
20 3
21 1 }
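The `_BAG_L2` entries in the summary above are stack ensembles: each level-2 model receives the level-1 models' out-of-fold predictions as additional input features, and the `WeightedEnsemble` models act as meta-learners on top. The same idea can be sketched with scikit-learn's `StackingRegressor` — an illustrative stand-in, not AutoGluon's implementation, with toy data in place of the bike-sharing features:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor, StackingRegressor
from sklearn.linear_model import RidgeCV

# Toy regression data standing in for the bike-sharing features
X, y = make_regression(n_samples=200, n_features=5, random_state=0)

stack = StackingRegressor(
    estimators=[("rf", RandomForestRegressor(n_estimators=20, random_state=0))],
    final_estimator=RidgeCV(),  # meta-learner, analogous to WeightedEnsemble_L2
    cv=5,                       # out-of-fold predictions, like AutoGluon's bagged folds
    passthrough=True,           # also feed raw features, like use_orig_features=True
)
stack.fit(X, y)
preds = stack.predict(X)
```

The `cv` parameter matters: the meta-learner is fit on out-of-fold predictions rather than in-sample ones, which is what keeps the stack from simply memorizing the base models' training-set fit.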
predictor_hpo_3e.leaderboard(silent=True)
|   | model | score_val | pred_time_val | fit_time | pred_time_val_marginal | fit_time_marginal | stack_level | can_infer | fit_order |
|---|---|---|---|---|---|---|---|---|---|
| 0 | WeightedEnsemble_L2 | -33.038009 | 0.001529 | 135.658681 | 0.000880 | 0.846915 | 2 | True | 15 |
| 1 | WeightedEnsemble_L3 | -33.158029 | 0.003626 | 385.054592 | 0.000834 | 0.378233 | 3 | True | 22 |
| 2 | LightGBM_BAG_L2/T1 | -33.484871 | 0.002360 | 292.803847 | 0.000132 | 25.689138 | 2 | True | 16 |
| 3 | ExtraTrees_BAG_L2/T2 | -33.565096 | 0.002403 | 283.678812 | 0.000175 | 16.564104 | 2 | True | 20 |
| 4 | LightGBM_BAG_L1/T2 | -33.609296 | 0.000131 | 33.872199 | 0.000131 | 33.872199 | 1 | True | 2 |
| 5 | ExtraTrees_BAG_L2/T3 | -33.730641 | 0.002363 | 277.433658 | 0.000135 | 10.318949 | 2 | True | 21 |
| 6 | ExtraTrees_BAG_L2/T1 | -33.756812 | 0.002407 | 275.210723 | 0.000179 | 8.096014 | 2 | True | 19 |
| 7 | CatBoost_BAG_L2/T1 | -33.821450 | 0.002318 | 317.492665 | 0.000090 | 50.377956 | 2 | True | 18 |
| 8 | RandomForest_BAG_L2/T1 | -33.991969 | 0.002394 | 292.045161 | 0.000166 | 24.930453 | 2 | True | 17 |
| 9 | CatBoost_BAG_L1/T1 | -36.826482 | 0.000135 | 71.848728 | 0.000135 | 71.848728 | 1 | True | 7 |
| 10 | ExtraTrees_BAG_L1/T6 | -37.282208 | 0.000216 | 16.457945 | 0.000216 | 16.457945 | 1 | True | 13 |
| 11 | ExtraTrees_BAG_L1/T7 | -37.387917 | 0.000124 | 14.057925 | 0.000124 | 14.057925 | 1 | True | 14 |
| 12 | ExtraTrees_BAG_L1/T5 | -37.424838 | 0.000138 | 11.915664 | 0.000138 | 11.915664 | 1 | True | 12 |
| 13 | ExtraTrees_BAG_L1/T2 | -37.427425 | 0.000157 | 10.959226 | 0.000157 | 10.959226 | 1 | True | 9 |
| 14 | ExtraTrees_BAG_L1/T4 | -37.506339 | 0.000182 | 9.332303 | 0.000182 | 9.332303 | 1 | True | 11 |
| 15 | ExtraTrees_BAG_L1/T3 | -37.648697 | 0.000166 | 6.704234 | 0.000166 | 6.704234 | 1 | True | 10 |
| 16 | ExtraTrees_BAG_L1/T1 | -37.917176 | 0.000160 | 5.374717 | 0.000160 | 5.374717 | 1 | True | 8 |
| 17 | RandomForest_BAG_L1/T2 | -38.254801 | 0.000247 | 21.524785 | 0.000247 | 21.524785 | 1 | True | 4 |
| 18 | RandomForest_BAG_L1/T4 | -38.329116 | 0.000154 | 17.215766 | 0.000154 | 17.215766 | 1 | True | 6 |
| 19 | RandomForest_BAG_L1/T3 | -38.341621 | 0.000166 | 12.632894 | 0.000166 | 12.632894 | 1 | True | 5 |
| 20 | RandomForest_BAG_L1/T1 | -38.490504 | 0.000159 | 9.875644 | 0.000159 | 9.875644 | 1 | True | 3 |
| 21 | LightGBM_BAG_L1/T1 | -39.654394 | 0.000092 | 25.342679 | 0.000092 | 25.342679 | 1 | True | 1 |
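Because `score_val` is the *negative* RMSE, the best model is the one with the highest (least negative) value, so picking it out of a leaderboard-style DataFrame is a one-liner. A minimal check with the top rows copied from the table above:

```python
import pandas as pd

# Minimal leaderboard with the top three rows from the table above
lb = pd.DataFrame({
    "model": ["WeightedEnsemble_L2", "WeightedEnsemble_L3", "LightGBM_BAG_L2/T1"],
    "score_val": [-33.038009, -33.158029, -33.484871],
})

# idxmax because higher (less negative) RMSE score is better
best = lb.loc[lb["score_val"].idxmax(), "model"]
```

This reproduces AutoGluon's reported best model, `WeightedEnsemble_L2`.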
fig = predictor_hpo_3e.leaderboard(silent=True).plot(kind="bar", x="model", y="score_val").figure
fig.savefig('img/exp_3e_leaderboard.png')
# Remember to set all negative values to zero
predictions_hpo_3e = predictor_hpo_3e.predict(test)
predictions_hpo_3e.describe()
count    6493.000000
mean      192.521133
std       173.963516
min        -9.803270
25%        47.315571
50%       151.798645
75%       285.074310
max       891.959351
Name: count, dtype: float64
(predictions_hpo_3e<0).sum()
59
predictions_hpo_3e = predictions_hpo_3e.apply(lambda x: 0 if x<0 else x)
predictions_hpo_3e.describe()
count    6493.000000
mean      192.541119
std       173.941145
min         0.000000
25%        47.315571
50%       151.798645
75%       285.074310
max       891.959351
Name: count, dtype: float64
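The elementwise lambda above works, but pandas' `Series.clip` does the same thing vectorized and reads more clearly. A quick equivalence check on toy values (not the real predictions):

```python
import pandas as pd

# Toy predictions including a few negatives, like the raw model output
preds = pd.Series([-9.8, 47.3, 151.8, -0.5, 891.9])

via_lambda = preds.apply(lambda x: 0 if x < 0 else x)
via_clip = preds.clip(lower=0)  # vectorized: floors every value at 0

assert via_lambda.equals(via_clip)
```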
submission_new_hpo_3e = pd.read_csv("sampleSubmission.csv")
# Submit predictions the same way as before
submission_new_hpo_3e["count"] = predictions_hpo_3e
submission_new_hpo_3e.to_csv("submission_new_hpo_3e.csv", index=False)
!kaggle competitions submit -c bike-sharing-demand -f submission_new_hpo_3e.csv -m "hp tuning 3e"
100%|█████████████████████████████████████████| 242k/242k [00:00<00:00, 487kB/s] Successfully submitted to Bike Sharing Demand
!kaggle competitions submissions -c bike-sharing-demand | tail -n +1 | head -n 10
fileName date description status publicScore privateScore
------------------------------ ------------------- ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- -------- ----------- ------------
submission_new_hpo_3e.csv 2022-12-26 13:59:46 hp tuning 3e complete 0.49307 0.49307
submission_new_hpo_3d.csv 2022-12-26 13:50:52 hp tuning 3d complete 0.52253 0.52253
submission_new_hpo_3c.csv 2022-12-26 13:46:38 hpo 3c num_bag_sets = 5 complete 0.62247 0.62247
submission_new_hpo_3b.csv 2022-12-26 13:35:38 hpo 3b num_bag_folds = 10 complete 0.63100 0.63100
submission_new_hpo_3a.csv 2022-12-26 13:24:35 hpo 3a num_stack_levels = 2 complete 0.66835 0.66835
submission_new_features_2b.csv 2022-12-26 13:13:10 new features 2b complete 0.65357 0.65357
submission_new_features_2a.csv 2022-12-26 12:35:34 new features 2a complete 0.62078 0.62078
submission.csv 2022-12-26 12:10:34 initial submission 1 complete 1.79067 1.79067
tail: error writing 'standard output': Broken pipe
Traceback (most recent call last):
File "/usr/local/bin/kaggle", line 8, in <module>
sys.exit(main())
File "/usr/local/lib/python3.7/site-packages/kaggle/cli.py", line 67, in main
out = args.func(**command_args)
File "/usr/local/lib/python3.7/site-packages/kaggle/api/kaggle_api_extended.py", line 618, in competition_submissions_cli
self.print_table(submissions, fields)
File "/usr/local/lib/python3.7/site-packages/kaggle/api/kaggle_api_extended.py", line 2253, in print_table
print(row_format.format(*i_fields))
BrokenPipeError: [Errno 32] Broken pipe
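The `BrokenPipeError` traceback above is harmless: `head` exits after printing 10 lines and closes the pipe while the Python-based kaggle CLI is still writing. Redirecting the CLI's stderr hides the noise; the effect can be demonstrated with a generic Python producer standing in for the kaggle command:

```shell
# head closes the pipe after 3 lines; the producer's BrokenPipeError
# traceback goes to stderr, which 2>/dev/null discards
python3 -c 'for i in range(100000): print(i)' 2>/dev/null | head -n 3
```

Applied to the cell above, that would be `!kaggle competitions submissions -c bike-sharing-demand 2>/dev/null | head -n 10` (which also drops the redundant `tail -n +1`).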
# Take the top model validation score from each training run and create a line plot to show improvement
# You can create these in the notebook and save them to PNG or use some other tool (e.g. google sheets, excel)
fig = pd.DataFrame(
    {
        "model": ["initial", "add_features", "hpo"],
        "score": [?, ?, ?]
    }
).plot(x="model", y="score", figsize=(8, 6)).get_figure()
fig.savefig('model_train_score.png')
# Take the kaggle score from each submission and create a line plot to show improvement
fig = pd.DataFrame(
    {
        "test_eval": ["initial_1", "add_features_2a", "add_features_2b", "hpo_3a", "hpo_3b", "hpo_3c", "hpo_3d", "hpo_3e"],
        "score": [1.79067, 0.62078, 0.65357, 0.66835, 0.63100, 0.62247, 0.52253, 0.49307]
    }
).plot(x="test_eval", y="score", figsize=(8, 6)).get_figure()
plt.xticks(rotation=45)
plt.grid(alpha=0.3)
fig.savefig('img/model_test_score.png')
# The hyperparameters tuned in each run, with the kaggle score as the result
hpo_table = pd.DataFrame({
    "model": ["initial_1", "add_features_2a", "add_features_2b", "hpo_3a", "hpo_3b", "hpo_3c", "hpo_3d", "hpo_3e"],
    "num_stack_levels": [1, 1, 1, 2, 1, 1, None, 1],
    "num_bag_folds": [8, 8, 8, 8, 10, 8, None, 8],
    "num_bag_sets": [20, 20, 20, 20, 20, 5, None, 5],
    "models_hpo": ["default", "default", "default", "default", "default", "default", "default", "optimized"],
    "score": [1.79067, 0.62078, 0.65357, 0.66835, 0.63100, 0.62247, 0.52253, 0.49307]
})
hpo_table
hpo_table
|   | model | num_stack_levels | num_bag_folds | num_bag_sets | models_hpo | score |
|---|---|---|---|---|---|---|
| 0 | initial_1 | 1.0 | 8.0 | 20.0 | default | 1.79067 |
| 1 | add_features_2a | 1.0 | 8.0 | 20.0 | default | 0.62078 |
| 2 | add_features_2b | 1.0 | 8.0 | 20.0 | default | 0.65357 |
| 3 | hpo_3a | 2.0 | 8.0 | 20.0 | default | 0.66835 |
| 4 | hpo_3b | 1.0 | 10.0 | 20.0 | default | 0.63100 |
| 5 | hpo_3c | 1.0 | 8.0 | 5.0 | default | 0.62247 |
| 6 | hpo_3d | NaN | NaN | NaN | default | 0.52253 |
| 7 | hpo_3e | 1.0 | 8.0 | 5.0 | optimized | 0.49307 |
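Across the runs in the table, the kaggle score (RMSLE, lower is better) dropped from 1.79067 on the raw data to 0.49307 after feature engineering and tuning. The relative improvement is easy to compute (scores hard-coded from the submission log above):

```python
# Kaggle RMSLE from the first and best submissions in the log above
initial, best = 1.79067, 0.49307

# Fractional reduction in error relative to the initial baseline
improvement = (initial - best) / initial
print(f"relative improvement: {improvement:.1%}")
```

That is roughly a 72% reduction in error, with the largest single jump coming from the added datetime features rather than from hyperparameter tuning.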